Skip to content

Commit e137cd6

Browse files
authored
Merge pull request #402 from weaviate/v1-37/query-profiling-restore
[v1.37] Query profiling
2 parents a05652a + 679bc20 commit e137cd6

9 files changed

Lines changed: 270 additions & 0 deletions

File tree

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
# START ProfileNearVector
2+
import weaviate
3+
from weaviate.classes.query import MetadataQuery
4+
5+
client = weaviate.connect_to_local()
6+
7+
collection = client.collections.get("Article")
8+
9+
response = collection.query.near_vector(
10+
near_vector=[0.1, 0.2, 0.3],
11+
limit=5,
12+
return_metadata=MetadataQuery(query_profile=True, distance=True),
13+
)
14+
15+
if response.query_profile:
16+
for shard in response.query_profile.shards:
17+
print(f"Shard: {shard.name} (node: {shard.node})")
18+
for search_type, profile in shard.searches.items():
19+
print(f" [{search_type}]")
20+
for key, value in profile.details.items():
21+
print(f" {key}: {value}")
22+
# END ProfileNearVector
23+
24+
# START ProfileBM25
25+
from weaviate.classes.query import MetadataQuery
26+
27+
collection = client.collections.get("Article")
28+
29+
response = collection.query.bm25(
30+
query="machine learning",
31+
return_metadata=MetadataQuery(query_profile=True, score=True),
32+
)
33+
34+
if response.query_profile:
35+
for shard in response.query_profile.shards:
36+
print(f"Shard: {shard.name} (node: {shard.node})")
37+
for search_type, profile in shard.searches.items():
38+
print(f" [{search_type}]")
39+
for key, value in profile.details.items():
40+
print(f" {key}: {value}")
41+
# END ProfileBM25
42+
43+
# START ProfileHybrid
44+
from weaviate.classes.query import MetadataQuery
45+
46+
collection = client.collections.get("Article")
47+
48+
response = collection.query.hybrid(
49+
query="machine learning",
50+
return_metadata=MetadataQuery(query_profile=True),
51+
limit=5,
52+
)
53+
54+
if response.query_profile:
55+
for shard in response.query_profile.shards:
56+
print(f"Shard: {shard.name} (node: {shard.node})")
57+
for search_type, profile in shard.searches.items():
58+
print(f" [{search_type}]")
59+
for key, value in profile.details.items():
60+
print(f" {key}: {value}")
61+
# END ProfileHybrid
62+
63+
# START ProfileMetadataList
64+
# You can also use list-style metadata
65+
response = collection.query.near_vector(
66+
near_vector=[0.1, 0.2, 0.3],
67+
limit=5,
68+
return_metadata=["query_profile", "distance"],
69+
)
70+
# END ProfileMetadataList
71+
72+
client.close()
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
:::info Added in `v1.36.9`
2+
:::

docs/deploy/configuration/logging.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -117,6 +117,8 @@ QUERY_SLOW_LOG_THRESHOLD=2s
117117

118118
When enabled, queries exceeding the threshold will be logged at the configured log level, allowing you to identify and optimize slow operations.
119119

120+
For per-query timing breakdowns on demand (without configuring log thresholds), see [Query profiling](/weaviate/search/query-profile.md).
121+
120122
### Tenant Activity Logging
121123

122124
For multi-tenant collections, you can configure the log level for tenant read and write activity.

docs/deploy/configuration/monitoring.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -497,6 +497,10 @@ your uses perfectly:
497497
| [Usage](https://github.com/weaviate/weaviate/blob/master/tools/dev/grafana/dashboards/usage.json) | Obtain usage metrics, such as number of objects imported, etc. | ![Usage](./img/weaviate-sample-dashboard-usage.png "Usage") |
498498
| [Aysnc index queue](https://github.com/weaviate/weaviate/blob/main/tools/dev/grafana/dashboards/index_queue.json) | Observe index queue activity | ![Async index queue](./img/weaviate-sample-dashboard-async-queue.png "Async index queue") |
499499

500+
## Query profiling
501+
502+
For per-query performance analysis, Weaviate provides [query profiling](/weaviate/search/query-profile.md). Unlike Prometheus metrics which show aggregate performance, query profiling provides per-shard timing breakdowns for individual queries — useful for diagnosing specific slow queries.
503+
500504
## `nodes` API Endpoint
501505

502506
To get collection details programmatically, use the [`nodes`](/deploy/configuration/status.md#cluster-node-data) REST endpoint.

docs/weaviate/api/graphql/additional-properties.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -154,6 +154,18 @@ The `score` will be the hybrid score of the result, based on the nominated [fusi
154154
The `explainScore` will be the hybrid score of the result, broken down into its vector and keyword search components. This can be used to understand why a result was scored the way it was.
155155

156156

157+
### Query profiling
158+
159+
import QueryProfileNote from '/_includes/feature-notes/query-profile.mdx';
160+
161+
<QueryProfileNote/>
162+
163+
Use `queryProfile` to get per-shard timing breakdowns for a search query. Profile data is returned on the response level (attached to the first result in GraphQL), not per object. It includes timing for vector search, keyword scoring, filter evaluation, and object retrieval, broken down by shard and cluster node.
164+
165+
In GraphQL, request `_additional { queryProfile }`. The profile is returned as a JSON string.
166+
167+
See [How-to: Query profiling](../../search/query-profile.md) for full details, available metrics, and Python examples.
168+
157169
### Classification
158170

159171
When a data-object has been <SkipLink href="/weaviate/api/rest#tag/classifications">subjected to classification</SkipLink>, you can get additional information about how the object was classified by running the following command:

docs/weaviate/more-resources/performance.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,10 @@ If you have a nested reference filter, Weaviate starts by resolving the deepest
5252

5353
A tip is to avoid deeply nested filters in the queries. Additionally, try to make your queries as restrictive as possible, because a ten-level deep query would for example not be so expensive if all levels return only a single ID. In that case only ten one ID searches need to be performed, which is a lot of searches in one query, but each search is very cheap.
5454

55+
## Profiling query performance
56+
57+
To diagnose slow queries, use [query profiling](/weaviate/search/query-profile.md) to get per-shard timing breakdowns. This shows exactly how long each phase takes — vector search, keyword scoring, filter evaluation, object retrieval — broken down by shard and cluster node.
58+
5559
## Questions and feedback
5660

5761
import DocsFeedback from '/_includes/docs-feedback.mdx';

docs/weaviate/search/basics.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -638,6 +638,10 @@ You can specify metadata fields to be returned.
638638

639639
For a comprehensive list of metadata fields, see [GraphQL: Additional properties](../api/graphql/additional-properties.md).
640640

641+
:::tip Debugging query performance
642+
Use [query profiling](./query-profile.md) to get per-shard timing breakdowns for any search query. Add `query_profile=True` to `MetadataQuery` to see exactly how long each phase takes.
643+
:::
644+
641645
## Multi-tenancy
642646

643647
If [multi-tenancy](../concepts/data.md#multi-tenancy) is enabled, specify the tenant parameter in each query.
Lines changed: 169 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,169 @@
1+
---
2+
title: Query profiling
3+
sidebar_position: 95
4+
image: og/docs/howto.jpg
5+
description: "Profile search queries to get per-shard timing breakdowns for vector search, keyword scoring, and filter evaluation."
6+
---
7+
8+
import FilteredTextBlock from '@site/src/components/Documentation/FilteredTextBlock';
9+
import PyCode from '!!raw-loader!/_includes/code/howto/search.profile.py';
10+
import QueryProfileNote from '/_includes/feature-notes/query-profile.mdx';
11+
12+
<QueryProfileNote/>
13+
14+
Query profiling provides per-shard timing breakdowns for search queries. Enable it on any search request to see how long each phase takes — vector search, keyword scoring, filter evaluation, object retrieval — broken down by shard and cluster node.
15+
16+
Profiling uses the same instrumentation as [slow query logging](/deploy/configuration/logging.md#slow-query-logging). It adds minimal overhead when enabled and zero overhead when disabled.
17+
18+
## Enable profiling
19+
20+
Add `query_profile=True` to `MetadataQuery`, or include `"query_profile"` in the metadata list:
21+
22+
<FilteredTextBlock
23+
text={PyCode}
24+
startMarker="# START ProfileNearVector"
25+
endMarker="# END ProfileNearVector"
26+
language="python"
27+
/>
28+
29+
Profile data is returned on the response object at `response.query_profile`, not on individual result objects. It represents the entire query across all shards.
30+
31+
## Supported search types
32+
33+
| Search type | Profile sections | Query methods |
34+
| :---------- | :--------------- | :------------ |
35+
| Vector search | `vector` | `near_vector`, `near_object`, `near_text`, `near_image`, etc. |
36+
| Keyword search (BM25) | `keyword` | `bm25` |
37+
| Hybrid search | `vector` + `keyword` | `hybrid` |
38+
| Object fetch | `object` | `fetch_objects` |
39+
| Any search + filters | Includes filter metrics | Add `filters` to any search |
40+
| Any search + groupBy | Profile at query level | Add `group_by` to any search |
41+
42+
### BM25 example
43+
44+
<FilteredTextBlock
45+
text={PyCode}
46+
startMarker="# START ProfileBM25"
47+
endMarker="# END ProfileBM25"
48+
language="python"
49+
/>
50+
51+
### Hybrid example
52+
53+
Hybrid search produces both `vector` and `keyword` profile sections per shard:
54+
55+
<FilteredTextBlock
56+
text={PyCode}
57+
startMarker="# START ProfileHybrid"
58+
endMarker="# END ProfileHybrid"
59+
language="python"
60+
/>
61+
62+
## Response structure
63+
64+
The profile is structured as:
65+
66+
```
67+
response.query_profile
68+
└── shards[]
69+
├── name # Shard identifier (e.g. "shard_0")
70+
├── node # Cluster node (e.g. "weaviate-0")
71+
└── searches # Dict of search type → profile
72+
├── "vector" → details: { key: value, ... }
73+
├── "keyword" → details: { key: value, ... }
74+
└── "object" → details: { key: value, ... }
75+
```
76+
77+
Each search type contains a `details` dict with string key-value pairs. The available metrics depend on the query type, index configuration, and filter usage.
78+
79+
## Available metrics
80+
81+
### General metrics
82+
83+
| Metric | Description | Present when |
84+
| :----- | :---------- | :----------- |
85+
| `total_took` | Total time for this shard's search | Always |
86+
| `objects_took` | Time retrieving objects from storage | Always |
87+
| `sort_took` | Time sorting results | When sorting is applied |
88+
89+
### Vector search metrics
90+
91+
| Metric | Description |
92+
| :----- | :---------- |
93+
| `vector_search_took` | Time spent in vector index search |
94+
| `knn_search_layer_N_took` | Per-layer HNSW graph traversal time (N = layer number) |
95+
| `knn_search_rescore_took` | Time rescoring compressed vectors (PQ/BQ/SQ) |
96+
| `hnsw_flat_search` | Whether flat (brute-force) search was used instead of HNSW (`"true"` or `"false"`) |
97+
98+
### Filter metrics
99+
100+
| Metric | Description |
101+
| :----- | :---------- |
102+
| `filters_build_allow_list_took` | Time building the filter allow-list |
103+
| `filters_ids_matched` | Number of object IDs matching the filter |
104+
105+
### BM25 keyword metrics
106+
107+
| Metric | Description |
108+
| :----- | :---------- |
109+
| `kwd_method` | BM25 scoring method used (e.g., `blockmaxwand`) |
110+
| `kwd_time` | Total BM25 scoring time |
111+
| `kwd_1_tok_time` | Query tokenization time |
112+
| `kwd_3_term_time` | Term dictionary lookup time |
113+
| `kwd_4_bmw_time` | BlockMaxWAND scoring time |
114+
| `kwd_6_res_count` | Number of results from keyword scoring |
115+
116+
## Example output
117+
118+
A hybrid search on a 3-node cluster with filters produces profiles for both vector and keyword phases on each shard:
119+
120+
```
121+
Shard: shard_abc (node: weaviate-0)
122+
[keyword]
123+
kwd_method: blockmaxwand
124+
kwd_time: 242.75µs
125+
kwd_1_tok_time: 18.291µs
126+
kwd_3_term_time: 52.083µs
127+
kwd_4_bmw_time: 156.417µs
128+
total_took: 248.833µs
129+
[vector]
130+
filters_build_allow_list_took: 31.125µs
131+
filters_ids_matched: 847
132+
knn_search_layer_0_took: 14µs
133+
objects_took: 153.542µs
134+
total_took: 198.666µs
135+
vector_search_took: 40.959µs
136+
137+
Shard: shard_def (node: weaviate-1)
138+
[keyword]
139+
kwd_method: blockmaxwand
140+
kwd_time: 189.333µs
141+
total_took: 195.25µs
142+
[vector]
143+
filters_build_allow_list_took: 27.458µs
144+
filters_ids_matched: 912
145+
total_took: 172.417µs
146+
vector_search_took: 35.75µs
147+
```
148+
149+
## Multi-node behavior
150+
151+
In multi-node clusters, the coordinator node aggregates profile data from all shards across all nodes. Each shard profile includes the `node` field identifying which cluster node executed that shard's search. This makes it straightforward to identify performance imbalances across nodes.
152+
153+
## Performance impact
154+
155+
- **When disabled (default):** Zero overhead. A single boolean check skips all profiling code paths.
156+
- **When enabled:** Adds timing instrumentation to each shard search. The overhead is small (microsecond-level timer reads) but measurable under high-throughput workloads. Use for debugging and optimization, not in production hot paths.
157+
158+
## Limitations
159+
160+
- **Response-level only:** Profile data is on `response.query_profile`, not on individual objects. It represents the entire query, not individual result objects.
161+
- **Search phases only:** Profiling covers vector search, keyword scoring, and filter evaluation. It does not include time spent in generative modules, rerankers, or post-processing.
162+
- **No per-object breakdown:** You get per-shard timing, not per-object timing.
163+
- **Metrics vary by query:** Not all metrics appear in every response. Available metrics depend on the search type, index type (HNSW vs. flat), compression settings, and whether filters are used.
164+
165+
## Questions and feedback
166+
167+
import DocsFeedback from '/_includes/docs-feedback.mdx';
168+
169+
<DocsFeedback/>

sidebars.js

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -634,6 +634,7 @@ const sidebars = {
634634
"weaviate/search/rerank",
635635
"weaviate/search/aggregate",
636636
"weaviate/search/filters",
637+
"weaviate/search/query-profile",
637638
{
638639
type: "link",
639640
label: "Search strategies: In depth",

0 commit comments

Comments
 (0)