Starting with `v1.33`, Weaviate enables **8-bit [RQ quantization](/weaviate/configuration/compression/rq-compression) by default** when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the [`DEFAULT_QUANTIZATION`](/deploy/configuration/env-vars#DEFAULT_QUANTIZATION) environment variable. Note that once enabled, quantization can't be disabled for a collection.
-Product quantization (PQ) is a form of data compression for vectors. PQ reduces the HNSW index's memory footprint so you can work with larger datasets. For a discussion of how PQ saves memory, see [Product quantization](/weaviate/concepts/vector-quantization#product-quantization).
+[**Product quantization (PQ)**](/weaviate/concepts/vector-quantization#product-quantization) is a form of data compression for vectors. PQ reduces the HNSW index's memory footprint so you can work with larger datasets. For a discussion of how PQ saves memory, see [Product quantization](/weaviate/concepts/vector-quantization#product-quantization).
docs/weaviate/best-practices/index.md
2 additions & 4 deletions
@@ -104,11 +104,9 @@ If you have a large number of vectors, consider using vector quantization to red

 For HNSW indexes, we suggest enabling [rotational quantization (RQ)](../configuration/compression/rq-compression.md) as a starting point. It provides significant memory usage benefits and almost no loss in query accuracy.

-:::info Compression by Default
+import CompressionByDefault from '/_includes/compression-by-default.mdx';

-Starting with `v1.33`, Weaviate enables **8-bit RQ quantization by default** when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the [`DEFAULT_QUANTIZATION`](/deploy/configuration/env-vars/index.md#DEFAULT_QUANTIZATION) environment variable.
+import CompressionByDefault from '/\_includes/compression-by-default.mdx';
+
+<CompressionByDefault/>
+
 ## What is quantization?

 In general, quantization techniques reduce the memory footprint by representing numbers with lower precision numbers, like rounding a number to the nearest integer. In neural networks, quantization reduces the values of the weights or activations of the model stored as a 32-bit floating-point number (4 bytes) to a lower precision number, such as an 8-bit integer (1 byte).
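The precision reduction described above (a 4-byte float stored as a 1-byte integer) can be sketched in a few lines. This is a generic min/max scalar quantizer written for illustration only; the function names `quantize_uint8` and `dequantize_uint8` are hypothetical, and it is not a description of how Weaviate implements any of its quantizers.

```python
from array import array
import random


def quantize_uint8(values):
    """Map floats onto 256 evenly spaced levels between their min and max."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # constant vectors: avoid dividing by zero
    codes = array("B", (round((v - lo) / scale) for v in values))
    return codes, lo, scale


def dequantize_uint8(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


random.seed(0)
# A 768-dimensional vector stored as 32-bit floats (illustrative size).
vector = array("f", (random.gauss(0.0, 1.0) for _ in range(768)))
codes, lo, scale = quantize_uint8(vector)
restored = dequantize_uint8(codes, lo, scale)

full_bytes = len(vector) * vector.itemsize   # 4 bytes per dimension
small_bytes = len(codes) * codes.itemsize    # 1 byte per dimension
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(full_bytes, small_bytes, max_error <= scale / 2 + 1e-9)
```

The codes take a quarter of the space, and each restored value is off by at most half a quantization step, which is the "rounding to the nearest integer" trade-off in miniature.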
docs/weaviate/configuration/compression/bq-compression.md
4 additions & 4 deletions
@@ -13,11 +13,11 @@ import TSCodeBQOptions from '!!raw-loader!/\_includes/code/howto/configure.bq-co
 import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compression.bq_test.go';
 import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/bq-compression.java';

-:::info Added in `v1.23`
-BQ is available for the [`flat` index](/weaviate/concepts/indexing/vector-index.md#flat-index) type from `v1.23` onwards and for the [`hnsw` index](/weaviate/config-refs/indexing/vector-index.mdx#hnsw-index) type from `v1.24`.
-:::
+import CompressionByDefault from '/\_includes/compression-by-default.mdx';
+
+<CompressionByDefault/>

-Binary quantization (BQ) is a vector compression technique that can reduce the size of a vector.
+[**Binary quantization (BQ)**](/weaviate/concepts/vector-quantization#binary-quantization) is a vector compression technique that can reduce the size of a vector.

 To use BQ, enable it as shown below and add data to the collection.
docs/weaviate/configuration/compression/index.md
2 additions & 4 deletions
@@ -18,11 +18,9 @@ To balance resource costs and system performance, consider one of these options:

 You can also [disable quantization](uncompressed.md) for a collection.

-:::info Compression by Default
+import CompressionByDefault from '/_includes/compression-by-default.mdx';

-Starting with `v1.33`, Weaviate enables **8-bit RQ quantization by default** when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the [`DEFAULT_QUANTIZATION`](/deploy/configuration/env-vars/index.md#DEFAULT_QUANTIZATION) environment variable.
docs/weaviate/configuration/compression/rq-compression.md
19 additions & 13 deletions
@@ -12,32 +12,29 @@ import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compre
 import TSCode from '!!raw-loader!/\_includes/code/howto/configure-rq/rq-compression-v3.ts';
 import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/rq-compression.java';

-:::info Added in `v1.32`
-
-**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
-
-:::
-
-:::caution Preview
+import CompressionByDefault from '/\_includes/compression-by-default.mdx';

-**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
-
-This means that the feature is still under development and may change in future releases, including potential breaking changes.
-**We do not recommend using this feature in production environments at this time.**
-
-:::
+<CompressionByDefault/>

 [**Rotational quantization (RQ)**](../../concepts/vector-quantization.md#rotational-quantization) is a fast vector compression technique that offers significant performance benefits. Two RQ variants are available in Weaviate:

 - **8-bit RQ**: Up to 4x compression while retaining almost perfect recall (98-99% on most datasets). **Recommended** for most use cases.
 - **1-bit RQ**: Close to 32x compression as dimensionality increases with moderate recall across various datasets.

 :::note HNSW only
+
 RQ is currently not supported for the flat index type.
+
 :::

 ## 8-bit RQ

+:::info Added in `v1.32`
+
+**8-bit Rotational quantization (RQ)** was added in **`v1.32`**.
+
+:::
+
 [8-bit RQ](../../concepts/vector-quantization.md#8-bit-rq) provides up to 4x compression while maintaining 98-99% recall in internal testing. It is generally recommended for most use cases as the default quantization technique.

 ### Enable compression for new collection
@@ -112,6 +109,15 @@ RQ can also be enabled for an existing collection by updating the collection def

 ## 1-bit RQ

+:::caution Preview
+
+**1-bit Rotational quantization (RQ)** was added in **`v1.33`** as a **preview**.<br/>
+
+This means that the feature is still under development and may change in future releases, including potential breaking changes.
+**We do not recommend using this feature in production environments at this time.**
+
+:::
+
 [1-bit RQ](../../concepts/vector-quantization.md#1-bit-rq) is a quantization technique that provides close to 32x compression as dimensionality increases. 1-bit RQ serves as a more robust and accurate alternative to [BQ](./bq-compression.md) with only a slight performance trade-off. While more performant than PQ in terms of encoding time and distance calculations, 1-bit RQ typically offers slightly lower recall than well-tuned [PQ](./pq-compression.md).
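One plausible reading of "close to 32x compression as dimensionality increases" is that each compressed vector carries a small fixed amount of metadata, so the ratio only approaches the ideal 32 bits to 1 bit as the vector gets longer. The arithmetic below is a sketch under that assumption; the 16-byte `overhead_bytes` default is a hypothetical figure chosen for illustration, not Weaviate's actual storage layout.

```python
def compressed_bytes(dims, bits, overhead_bytes):
    """Bytes for one quantized vector: packed codes plus fixed per-vector metadata."""
    return (dims * bits + 7) // 8 + overhead_bytes


def compression_ratio(dims, bits, overhead_bytes=16):
    """Ratio of float32 storage to quantized storage (overhead is an assumed value)."""
    raw_bytes = dims * 4  # 32-bit floats, 4 bytes per dimension
    return raw_bytes / compressed_bytes(dims, bits, overhead_bytes)


for dims in (128, 768, 3072):
    eight_bit = compression_ratio(dims, bits=8)
    one_bit = compression_ratio(dims, bits=1)
    print(f"{dims:>5} dims: 8-bit {eight_bit:.2f}x, 1-bit {one_bit:.2f}x")
```

Under this model the 8-bit ratio stays just below 4x and the 1-bit ratio climbs toward 32x as dimensionality grows, matching the "up to" and "close to" wording in the text.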
docs/weaviate/configuration/compression/sq-compression.md
3 additions & 3 deletions
@@ -13,11 +13,11 @@ import TSCodeSQOptions from '!!raw-loader!/\_includes/code/howto/configure-sq/sq
 import GoCode from '!!raw-loader!/\_includes/code/howto/go/docs/configure/compression.sq_test.go';
 import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/weaviate/docs/sq-compression.java';

-:::info Added in v1.26.0
+import CompressionByDefault from '/\_includes/compression-by-default.mdx';

-:::
+<CompressionByDefault/>

-[Scalar quantization (SQ)](/weaviate/concepts/vector-quantization#scalar-quantization) is a vector compression technique that can reduce the size of a vector.
+[**Scalar quantization (SQ)**](/weaviate/concepts/vector-quantization#scalar-quantization) is a vector compression technique that can reduce the size of a vector.

 To use SQ, enable it in the collection definition, then add data to the collection.