Commit dbd92bd

Update compression notes
1 parent 77ca4fd commit dbd92bd

10 files changed

Lines changed: 48 additions & 33 deletions

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,5 @@
1+
:::info Compression by Default
2+
3+
Starting with `v1.33`, Weaviate enables **8-bit [RQ quantization](/weaviate/configuration/compression/rq-compression) by default** when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the [`DEFAULT_QUANTIZATION`](/deploy/configuration/env-vars#DEFAULT_QUANTIZATION) environment variable. Note that once enabled, quantization can't be disabled for a collection.
4+
5+
:::
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
Product quantization (PQ) is a form of data compression for vectors. PQ reduces the HNSW index's memory footprint so you can work with larger datasets. For a discussion of how PQ saves memory, see [Product quantization](/weaviate/concepts/vector-quantization#product-quantization).
1+
[**Product quantization (PQ)**](/weaviate/concepts/vector-quantization#product-quantization) is a form of data compression for vectors. PQ reduces the HNSW index's memory footprint so you can work with larger datasets. For a discussion of how PQ saves memory, see [Product quantization](/weaviate/concepts/vector-quantization#product-quantization).

docs/weaviate/best-practices/index.md

Lines changed: 2 additions & 4 deletions
@@ -104,11 +104,9 @@ If you have a large number of vectors, consider using vector quantization to red
104104

105105
For HNSW indexes, we suggest enabling [rotational quantization (RQ)](../configuration/compression/rq-compression.md) as a starting point. It provides significant memory usage benefits and almost no loss in query accuracy.
106106

107-
:::info Compression by Default
107+
import CompressionByDefault from '/_includes/compression-by-default.mdx';
108108

109-
Starting with `v1.33`, Weaviate enables **8-bit RQ quantization by default** when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the [`DEFAULT_QUANTIZATION`](/deploy/configuration/env-vars/index.md#DEFAULT_QUANTIZATION) environment variable.
110-
111-
:::
109+
<CompressionByDefault/>
112110

113111
:::tip Further resources
114112
- [How-to: Configure vector quantization](../configuration/compression/index.md)

docs/weaviate/concepts/vector-quantization.md

Lines changed: 4 additions & 0 deletions
@@ -15,6 +15,10 @@ Weaviate currently offers four vector quantization techniques:
1515
- [Scalar quantization (SQ)](#scalar-quantization)
1616
- [Rotational quantization (RQ)](#rotational-quantization)
1717

18+
import CompressionByDefault from '/\_includes/compression-by-default.mdx';
19+
20+
<CompressionByDefault/>
21+
1822
## What is quantization?
1923

2024
In general, quantization techniques reduce the memory footprint by representing numbers with lower precision numbers, like rounding a number to the nearest integer. In neural networks, quantization reduces the values of the weights or activations of the model stored as a 32-bit floating-point number (4 bytes) to a lower precision number, such as an 8-bit integer (1 byte).
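
As a side note on the context paragraph above, the idea of storing a 32-bit float as an 8-bit integer can be illustrated with a minimal sketch. This is a generic min/max scalar mapping written for illustration only; it is not taken from the commit and does not claim to be how Weaviate implements SQ or RQ. The function names `quantize_8bit` and `dequantize_8bit` are hypothetical.

```python
def quantize_8bit(values):
    """Map floats to integer codes in 0..255 using the min/max of the input.

    Illustrative only: a generic scalar mapping, not Weaviate's algorithm.
    """
    lo, hi = min(values), max(values)
    # Width of one quantization step; fall back to 1.0 for constant input.
    scale = (hi - lo) / 255 or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize_8bit(codes, lo, scale):
    """Reconstruct approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


values = [-1.0, -0.25, 0.1, 0.7, 1.0]
codes, lo, scale = quantize_8bit(values)
restored = dequantize_8bit(codes, lo, scale)

for original, approx in zip(values, restored):
    # Each value is recovered to within half a quantization step.
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each code fits in 1 byte instead of 4, and the reconstruction error per value is bounded by half a step (`scale / 2`), which is the "rounding" trade-off the paragraph describes.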

docs/weaviate/configuration/compression/bq-compression.md

Lines changed: 4 additions & 4 deletions
@@ -13,11 +13,11 @@ import TSCodeBQOptions from '!!raw-loader!/\_includes/code/howto/configure.bq-co
1313
import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compression.bq_test.go';
1414
import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/bq-compression.java';
1515

16-
:::info Added in `v1.23`
17-
BQ is available for the [`flat` index](/weaviate/concepts/indexing/vector-index.md#flat-index) type from `v1.23` onwards and for the [`hnsw` index](/weaviate/config-refs/indexing/vector-index.mdx#hnsw-index) type from `v1.24`.
18-
:::
16+
import CompressionByDefault from '/\_includes/compression-by-default.mdx';
17+
18+
<CompressionByDefault/>
1919

20-
Binary quantization (BQ) is a vector compression technique that can reduce the size of a vector.
20+
[**Binary quantization (BQ)**](/weaviate/concepts/vector-quantization#binary-quantization) is a vector compression technique that can reduce the size of a vector.
2121

2222
To use BQ, enable it as shown below and add data to the collection.
2323

docs/weaviate/configuration/compression/index.md

Lines changed: 2 additions & 4 deletions
@@ -18,11 +18,9 @@ To balance resource costs and system performance, consider one of these options:
1818

1919
You can also [disable quantization](uncompressed.md) for a collection.
2020

21-
:::info Compression by Default
21+
import CompressionByDefault from '/_includes/compression-by-default.mdx';
2222

23-
Starting with `v1.33`, Weaviate enables **8-bit RQ quantization by default** when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the [`DEFAULT_QUANTIZATION`](/deploy/configuration/env-vars/index.md#DEFAULT_QUANTIZATION) environment variable.
24-
25-
:::
23+
<CompressionByDefault/>
2624

2725
## Multi-vector encoding
2826

docs/weaviate/configuration/compression/pq-compression.md

Lines changed: 3 additions & 3 deletions
@@ -13,9 +13,9 @@ import TSCodeManualPQ from '!!raw-loader!/\_includes/code/howto/configure.pq-com
1313
import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compression.pq_test.go';
1414
import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/pq-compression.java';
1515

16-
:::note
17-
Starting in v1.23, AutoPQ simplifies configuring PQ on new collections.
18-
:::
16+
import CompressionByDefault from '/\_includes/compression-by-default.mdx';
17+
18+
<CompressionByDefault/>
1919

2020
import PQOverview from '/\_includes/configuration/pq-compression/overview-text.mdx' ;
2121

docs/weaviate/configuration/compression/rq-compression.md

Lines changed: 19 additions & 13 deletions
@@ -12,32 +12,29 @@ import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compre
1212
import TSCode from '!!raw-loader!/\_includes/code/howto/configure-rq/rq-compression-v3.ts';
1313
import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/rq-compression.java';
1414

15-
:::info Added in `v1.32`
16-
17-
**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
18-
19-
:::
20-
21-
:::caution Preview
15+
import CompressionByDefault from '/\_includes/compression-by-default.mdx';
2216

23-
**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
24-
25-
This means that the feature is still under development and may change in future releases, including potential breaking changes.
26-
**We do not recommend using this feature in production environments at this time.**
27-
28-
:::
17+
<CompressionByDefault/>
2918

3019
[**Rotational quantization (RQ)**](../../concepts/vector-quantization.md#rotational-quantization) is a fast vector compression technique that offers significant performance benefits. Two RQ variants are available in Weaviate:
3120

3221
- **8-bit RQ**: Up to 4x compression while retaining almost perfect recall (98-99% on most datasets). **Recommended** for most use cases.
3322
- **1-bit RQ**: Close to 32x compression as dimensionality increases with moderate recall across various datasets.
3423

3524
:::note HNSW only
25+
3626
RQ is currently not supported for the flat index type.
27+
3728
:::
3829

3930
## 8-bit RQ
4031

32+
:::info Added in `v1.32`
33+
34+
**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
35+
36+
:::
37+
4138
[8-bit RQ](../../concepts/vector-quantization.md#8-bit-rq) provides up to 4x compression while maintaining 98-99% recall in internal testing. It is generally recommended as the default quantization technique for most use cases.
4239

4340
### Enable compression for new collection
@@ -112,6 +109,15 @@ RQ can also be enabled for an existing collection by updating the collection def
112109

113110
## 1-bit RQ
114111

112+
:::caution Preview
113+
114+
**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
115+
116+
This means that the feature is still under development and may change in future releases, including potential breaking changes.
117+
**We do not recommend using this feature in production environments at this time.**
118+
119+
:::
120+
115121
[1-bit RQ](../../concepts/vector-quantization.md#1-bit-rq) is a quantization technique that provides close to 32x compression as dimensionality increases. 1-bit RQ serves as a more robust and accurate alternative to [BQ](./bq-compression.md) with only a slight performance trade-off. While more performant than PQ in terms of encoding time and distance calculations, 1-bit RQ typically offers slightly lower recall than well-tuned [PQ](./pq-compression.md).
116122

117123
### Enable compression for new collection
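
A note on the compression ratios quoted in the RQ hunks above ("up to 4x" for 8-bit, "close to 32x as dimensionality increases" for 1-bit): the arithmetic can be sketched as below, assuming uncompressed vectors are stored as 32-bit floats. The `overhead_bytes` parameter is a hypothetical fixed per-vector cost introduced purely to illustrate why the ratio approaches its ceiling only at higher dimensionality; the value 16 is made up and is not a figure from Weaviate.

```python
def compression_ratio(dimensions, bits_per_dim, overhead_bytes=0):
    """Ratio of float32 storage to quantized storage for one vector.

    overhead_bytes is a hypothetical fixed per-vector cost (assumption,
    not a documented Weaviate value).
    """
    raw_bytes = dimensions * 4  # 32-bit float = 4 bytes per dimension
    compressed_bytes = dimensions * bits_per_dim / 8 + overhead_bytes
    return raw_bytes / compressed_bytes


# Ideal ceilings with no per-vector overhead.
print(compression_ratio(1536, bits_per_dim=8))  # 4.0
print(compression_ratio(1536, bits_per_dim=1))  # 32.0

# With an assumed 16-byte overhead, the 1-bit ratio climbs toward 32x
# as the number of dimensions grows.
for dims in (256, 768, 1536):
    ratio = compression_ratio(dims, bits_per_dim=1, overhead_bytes=16)
    print(dims, round(ratio, 2))
```

Under this assumption, any fixed per-vector cost is amortized over more dimensions as vectors get longer, which is consistent with the "as dimensionality increases" wording.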

docs/weaviate/configuration/compression/sq-compression.md

Lines changed: 3 additions & 3 deletions
@@ -13,11 +13,11 @@ import TSCodeSQOptions from '!!raw-loader!/\_includes/code/howto/configure-sq/sq
1313
import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compression.sq_test.go';
1414
import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/sq-compression.java';
1515

16-
:::info Added in v1.26.0
16+
import CompressionByDefault from '/\_includes/compression-by-default.mdx';
1717

18-
:::
18+
<CompressionByDefault/>
1919

20-
[Scalar quantization (SQ)](/weaviate/concepts/vector-quantization#scalar-quantization) is a vector compression technique that can reduce the size of a vector.
20+
[**Scalar quantization (SQ)**](/weaviate/concepts/vector-quantization#scalar-quantization) is a vector compression technique that can reduce the size of a vector.
2121

2222
To use SQ, enable it in the collection definition, then add data to the collection.
2323

docs/weaviate/configuration/compression/uncompressed.md

Lines changed: 5 additions & 1 deletion
@@ -13,7 +13,11 @@ import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compre
1313
import TSCode from '!!raw-loader!/\_includes/code/howto/configure-rq/rq-compression-v3.ts';
1414
import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/rq-compression.java';
1515

16-
You can opt-out of using vector quantization to compress your vector data.
16+
import CompressionByDefault from '/\_includes/compression-by-default.mdx';
17+
18+
<CompressionByDefault/>
19+
20+
You can opt-out of using vector quantization to compress your vector data.
1721

1822
## Disable compression for new collection
1923
