We're currently using target-determinator for our monorepo and are quite happy with it.
As we move more components into the monorepo, we are noticing that the performance of target-determinator is becoming problematic. Since we run it on every PR CI build, we would like its runtime to be as low as possible.
Here's some data for our monorepo with ~2000 components.
00:00:00 Processing revision 'before' (sha: [redacted])
00:00:08 Running cquery on deps(//...)
00:01:41 Running cquery on //...
00:02:10 Finding compatible targets under //...
00:02:35 Matching labels to configurations
00:02:40 Hashing targets
00:02:59 Processing revision 'after' (current working directory state)
00:03:03 Running cquery on deps(//...)
00:04:18 Running cquery on //...
00:04:48 Finding compatible targets under //...
00:05:12 Matching labels to configurations
00:05:17 Hashing targets
00:05:38 Finished after 5m37.207708484s
Profile for bazel cquery deps(//...) --output_file=/dev/null --transitions=lite --output=streamed_proto

It seems like we're spending most of the time in the cquery calls, and they're bottlenecked by a single-threaded action for most of their runtime. Additionally, it might be possible to do some of the work in target-determinator itself concurrently.
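To put rough numbers on this, the per-phase durations can be derived from the log above. The following is a minimal sketch, assuming each timestamp marks the start of a phase and that the phase runs until the next log line; the helper names (`parse_offset`, `phase_durations`, `cquery_seconds`) are hypothetical and not part of target-determinator.

```python
from typing import List, Tuple

# Log lines copied from the run above.
LOG = """\
00:00:00 Processing revision 'before' (sha: [redacted])
00:00:08 Running cquery on deps(//...)
00:01:41 Running cquery on //...
00:02:10 Finding compatible targets under //...
00:02:35 Matching labels to configurations
00:02:40 Hashing targets
00:02:59 Processing revision 'after' (current working directory state)
00:03:03 Running cquery on deps(//...)
00:04:18 Running cquery on //...
00:04:48 Finding compatible targets under //...
00:05:12 Matching labels to configurations
00:05:17 Hashing targets
00:05:38 Finished after 5m37.207708484s
"""


def parse_offset(stamp: str) -> int:
    """Convert an HH:MM:SS offset into seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def phase_durations(log: str) -> List[Tuple[str, int]]:
    """Return (message, seconds) per phase.

    Assumption: a phase lasts from its own timestamp until the next line's
    timestamp, so the final "Finished" line has no duration of its own.
    """
    entries = []
    for line in log.strip().splitlines():
        stamp, _, message = line.partition(" ")
        entries.append((parse_offset(stamp), message))
    return [
        (message, next_start - start)
        for (start, message), (next_start, _) in zip(entries, entries[1:])
    ]


def cquery_seconds(durations: List[Tuple[str, int]]) -> int:
    """Sum the time of phases whose message starts with 'Running cquery'."""
    return sum(sec for msg, sec in durations if msg.startswith("Running cquery"))


durations = phase_durations(LOG)
total = sum(sec for _, sec in durations)
for message, sec in durations:
    print(f"{sec:>4}s  {message}")
print(f"cquery phases: {cquery_seconds(durations)}s of {total}s")
```

Under that assumption, the two "Running cquery" phases account for 227 of the 338 seconds across both revisions, roughly two thirds of the run.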
In our primary use case, running the diff on PR builds against the latest master branch, we're also doing a lot of duplicate work. Instead of checking out the before/master revision and redoing all the queries and computations for it on every PR build, we could fetch pre-computed data from a previous build of that revision.
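One possible shape for this, as a sketch only: cache the per-target hashes of the "before" revision keyed by its commit sha, and only run the expensive queries on a cache miss. Everything here is hypothetical (the JSON layout, `load_or_compute`, `changed_targets`, and the `compute` callback standing in for the cquery and hashing steps); it does not reflect target-determinator's actual internals. It also assumes the hashes depend only on the revision, so the cache key would need to include anything else that affects them, such as the Bazel version or flags.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical representation: target label -> content hash.
TargetHashes = Dict[str, str]


def load_or_compute(
    cache_dir: Path,
    sha: str,
    compute: Callable[[str], TargetHashes],
) -> TargetHashes:
    """Return target hashes for a revision, reusing a cached result if present.

    `compute` stands in for the expensive checkout + cquery + hashing work.
    """
    path = cache_dir / f"{sha}.json"
    if path.exists():
        return json.loads(path.read_text())
    hashes = compute(sha)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(hashes, sort_keys=True))
    return hashes


def changed_targets(before: TargetHashes, after: TargetHashes) -> List[str]:
    """Labels that are new in `after` or whose hash differs from `before`."""
    return sorted(
        label for label, digest in after.items() if before.get(label) != digest
    )
```

With something like this, a CI job on master could populate the cache once per commit, and PR builds would only need to compute hashes for the "after" revision before calling `changed_targets`.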
I would love to hear everyone's thoughts on this, in particular where the best opportunities for optimizations are and which of them would be possible to add to this project. I'm happy to contribute with implementations. If you would like any extra benchmarking data, I can provide that as well of course.