Performance on large Monorepos

We're currently using target-determinator for our monorepo and are quite happy with it.
As we move more components into the monorepo, we are noticing that the performance of target-determinator is becoming problematic. Since we run it on every PR CI build, we would like its runtime to be as low as possible.

Here's the timing log of a target-determinator run on our monorepo with ~2000 components:

```
00:00:00 Processing revision 'before' (sha: [redacted])
00:00:08 Running cquery on deps(//...)
00:01:41 Running cquery on //...
00:02:10 Finding compatible targets under //...
00:02:35 Matching labels to configurations
00:02:40 Hashing targets
00:02:59 Processing revision 'after' (current working directory state)
00:03:03 Running cquery on deps(//...)
00:04:18 Running cquery on //...
00:04:48 Finding compatible targets under //...
00:05:12 Matching labels to configurations
00:05:17 Hashing targets
00:05:38 Finished after 5m37.207708484s
```
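To make the breakdown easier to read, here is a small sketch that turns the log above into per-phase durations. It assumes each phase lasts until the timestamp of the next log line; the helper names (`parse_offset`, `phase_durations`) are made up for this illustration and are not part of target-determinator.

```python
# Timing log copied verbatim from the run above.
LOG = """\
00:00:00 Processing revision 'before' (sha: [redacted])
00:00:08 Running cquery on deps(//...)
00:01:41 Running cquery on //...
00:02:10 Finding compatible targets under //...
00:02:35 Matching labels to configurations
00:02:40 Hashing targets
00:02:59 Processing revision 'after' (current working directory state)
00:03:03 Running cquery on deps(//...)
00:04:18 Running cquery on //...
00:04:48 Finding compatible targets under //...
00:05:12 Matching labels to configurations
00:05:17 Hashing targets
00:05:38 Finished after 5m37.207708484s
"""


def parse_offset(stamp):
    """Convert an HH:MM:SS offset into seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def phase_durations(log):
    """Return (message, seconds) pairs, assuming a phase ends when the next line is logged."""
    entries = []
    for line in log.strip().splitlines():
        stamp, _, message = line.partition(" ")
        entries.append((parse_offset(stamp), message))
    return [
        (message, next_start - start)
        for (start, message), (next_start, _) in zip(entries, entries[1:])
    ]


durations = phase_durations(LOG)
total = sum(seconds for _, seconds in durations)
cquery = sum(seconds for message, seconds in durations if message.startswith("Running cquery"))

for message, seconds in durations:
    print(f"{seconds:4d}s  {message}")
print(f"cquery: {cquery}s of {total}s ({cquery / total:.0%})")
```

With one-second timestamp resolution this attributes 227 of 338 seconds (about 67%) to the four `Running cquery` phases. The "Finding compatible targets" phases are counted separately here because the log does not say what they run internally.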

Profile for `bazel cquery deps(//...) --output_file=/dev/null --transitions=lite --output=streamed_proto`:
<img width="2308" height="587" alt="Image" src="https://github.com/user-attachments/assets/0d933293-4b69-41f7-8f46-7fa2f6b0b7a0" />

It seems we're spending most of the time in the cquery calls, and for most of their runtime they're bottlenecked by a single-threaded action. Additionally, it might be possible to run some of the work in target-determinator itself concurrently.

In our primary use case, running the diff on PR builds against the latest master branch, we're also doing a lot of duplicate work: every PR build recomputes the same data for the 'before' revision, which takes roughly the first three minutes of the run in the log above. Instead of checking out the before/master branch and running all the queries and computations on it again, we could fetch pre-computed data from a previous build.
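A minimal sketch of what such a cache could look like, assuming the per-revision result can be represented as a mapping from target label to hash and stored as one JSON file per commit SHA. Everything here is hypothetical: `load_or_compute_hashes`, `changed_targets`, the `compute` callback, and the file layout are illustrations, not existing target-determinator APIs.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def load_or_compute_hashes(
    sha: str,
    cache_dir: Path,
    compute: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Return the label -> hash mapping for a revision, reusing a cached copy if present.

    `compute` is a hypothetical stand-in for the expensive checkout, cquery,
    and hashing steps that are run for the 'before' revision today.
    """
    cache_file = cache_dir / f"{sha}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())

    hashes = compute(sha)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a crashed build never leaves a partial entry.
    tmp_file = cache_file.with_suffix(".tmp")
    tmp_file.write_text(json.dumps(hashes, sort_keys=True))
    tmp_file.replace(cache_file)
    return hashes


def changed_targets(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Labels that are new in `after` or whose hash differs from `before`.

    Targets that only exist in `before` are ignored, since they can no longer be built.
    """
    return sorted(label for label, digest in after.items() if before.get(label) != digest)
```

In this sketch a build on master would populate the cache, and a PR build would only compute the 'after' side. A real cache key would probably need to cover more than the commit SHA, for example the Bazel version and any flags that influence the hashes, and the cache directory would have to live in storage shared between CI builds.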

I would love to hear everyone's thoughts on this, in particular where the best opportunities for optimization are and which of them could be added to this project. I'm happy to contribute implementations, and I can provide extra benchmarking data if that would help.


