Performance on large Monorepos #118

@dziemba


We're currently using target-determinator for our monorepo and are quite happy with it.
As we move more components into the monorepo, we are noticing that the performance of target-determinator is becoming problematic. Since we run it on each PR CI build, we would like the runtime to be as low as possible.

Here's some data for our monorepo with ~2000 components.

00:00:00 Processing revision 'before' (sha: [redacted])
00:00:08 Running cquery on deps(//...)
00:01:41 Running cquery on //...
00:02:10 Finding compatible targets under //...
00:02:35 Matching labels to configurations
00:02:40 Hashing targets
00:02:59 Processing revision 'after' (current working directory state)
00:03:03 Running cquery on deps(//...)
00:04:18 Running cquery on //...
00:04:48 Finding compatible targets under //...
00:05:12 Matching labels to configurations
00:05:17 Hashing targets
00:05:38 Finished after 5m37.207708484s
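
To make the breakdown easier to read, here is a small sketch that turns the log above into per-phase durations. It assumes each timestamp marks the start of the phase named on that line, so a phase's duration is the gap to the next line. The helper names (`parse`, `phase_durations`) are made up for this illustration and are not part of target-determinator.

```python
LOG = """\
00:00:00 Processing revision 'before' (sha: [redacted])
00:00:08 Running cquery on deps(//...)
00:01:41 Running cquery on //...
00:02:10 Finding compatible targets under //...
00:02:35 Matching labels to configurations
00:02:40 Hashing targets
00:02:59 Processing revision 'after' (current working directory state)
00:03:03 Running cquery on deps(//...)
00:04:18 Running cquery on //...
00:04:48 Finding compatible targets under //...
00:05:12 Matching labels to configurations
00:05:17 Hashing targets
00:05:38 Finished after 5m37.207708484s
"""


def parse(log):
    """Return (offset_seconds, message) pairs from HH:MM:SS-prefixed lines."""
    entries = []
    for line in log.strip().splitlines():
        stamp, message = line.split(" ", 1)
        hours, minutes, seconds = (int(part) for part in stamp.split(":"))
        entries.append((hours * 3600 + minutes * 60 + seconds, message))
    return entries


def phase_durations(entries):
    """Assumption: a phase lasts from its own timestamp to the next one."""
    return [
        (message, next_start - start)
        for (start, message), (next_start, _) in zip(entries, entries[1:])
    ]


durations = phase_durations(parse(LOG))
cquery_seconds = sum(sec for msg, sec in durations if msg.startswith("Running cquery"))
total_seconds = sum(sec for _, sec in durations)

for message, seconds in durations:
    print(f"{seconds:4d}s  {message}")
print(f"cquery: {cquery_seconds}s of {total_seconds}s")
```

Under that assumption, the two cquery phases account for 227 of the 338 seconds covered by the timestamps, with `deps(//...)` alone taking 93s and 75s in the two revisions.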

Profile for bazel cquery deps(//...) --output_file=/dev/null --transitions=lite --output=streamed_proto:
[Image: profile screenshot]

It seems we're spending most of the time in the cquery calls, which are bottlenecked by a single-threaded action for most of their runtime. Additionally, it might be possible to do some of the work concurrently in target-determinator itself.

In our primary use-case of running the diff on PR builds against the latest master branch, we're also doing a lot of duplicate work. Instead of checking out and doing all the queries/computations on the before/master branch, we could just fetch some pre-computed data from a previous build.
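
As a rough illustration of the pre-computed data idea, the sketch below stores a label-to-hash map per commit sha and reuses it instead of recomputing the 'before' side. Everything here is hypothetical: the JSON-file-per-sha layout, the function names, and the assumption that target-determinator's per-target hashes could be serialized this way are mine, not existing features of the tool.

```python
import json
import tempfile
from pathlib import Path


def cache_path(cache_dir, sha):
    """Hypothetical layout: one JSON file per commit sha."""
    return Path(cache_dir) / f"{sha}.json"


def load_hashes(cache_dir, sha):
    """Return the cached label -> hash map for sha, or None on a cache miss."""
    path = cache_path(cache_dir, sha)
    if path.exists():
        return json.loads(path.read_text())
    return None


def store_hashes(cache_dir, sha, hashes):
    """Persist the label -> hash map, e.g. at the end of a master build."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    cache_path(cache_dir, sha).write_text(json.dumps(hashes, sort_keys=True))


def changed_targets(before, after):
    """Labels in 'after' that are new or whose hash differs from 'before'."""
    return sorted(
        label for label, digest in after.items() if before.get(label) != digest
    )


with tempfile.TemporaryDirectory() as cache_dir:
    # A master build would store its hashes once (placeholder sha and hashes)...
    store_hashes(cache_dir, "abc123", {"//app:bin": "h1", "//lib:core": "h2"})

    # ...and each PR build would load them instead of re-running the queries.
    before = load_hashes(cache_dir, "abc123")
    after = {"//app:bin": "h1", "//lib:core": "h3", "//lib:new": "h4"}
    affected = changed_targets(before, after)
    print(affected)
```

On a cache miss (`load_hashes` returns `None`), a PR build would fall back to computing the 'before' side as it does today, so the cache would only ever be an optimization.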

I would love to hear everyone's thoughts on this, in particular where the best opportunities for optimizations are and which of them would be possible to add to this project. I'm happy to contribute with implementations. If you would like any extra benchmarking data, I can provide that as well of course.
