Maintain dynamic count of URLs per crawl ID #141

Closed
klockla wants to merge 1 commit into crawler-commons:master from klockla:crawl-stats

klockla (Collaborator) commented Nov 14, 2025

This PR introduces a new API call, GetCrawlStats, which returns the number of completed and total URLs per crawl.

Signed-off-by: Laurent Klock <Laurent.Klock@arhs-cube.com>
klockla added this to the 2.6 milestone Nov 14, 2025
klockla self-assigned this Nov 14, 2025
klockla added the API label Nov 14, 2025

jnioche (Collaborator) commented Nov 14, 2025

Sorry, I don't have time to look at it in detail, but how is it different from


     /** Return stats for a specific queue or an entire crawl. Does not aggregate the stats across different crawlids. **/
     rpc GetStats(QueueWithinCrawlParams) returns (Stats) {}

klockla (Collaborator, Author) commented Nov 14, 2025

> Sorry, I don't have time to look at it in detail, but how is it different from
>
>     /** Return stats for a specific queue or an entire crawl. Does not aggregate the stats across different crawlids. **/
>     rpc GetStats(QueueWithinCrawlParams) returns (Stats) {}

Yes, I see your point now :) What I really needed was the total number of URLs per crawl. I got a bit confused by GetStats and the concept of active URLs (at some point I mixed it up with "in process"), and I didn't realize that the total number of URLs can be retrieved with GetStats as active + completed URLs. So I guess we can conclude that there is no need for this PR.
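
The arithmetic described in this comment can be sketched as follows. This is only a minimal illustration, not the URL Frontier client API: `StatsSnapshot` and its fields `active` and `completed` are hypothetical names standing in for whatever values are read from a `GetStats` response for a single crawl ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatsSnapshot:
    """Hypothetical stand-in for values read from a GetStats response."""
    active: int     # URLs still known to the frontier but not yet completed
    completed: int  # URLs already completed


def total_urls(stats: StatsSnapshot) -> int:
    """Total URLs for the crawl, computed as active + completed."""
    if stats.active < 0 or stats.completed < 0:
        raise ValueError("counts must be non-negative")
    return stats.active + stats.completed


# Example values are made up for illustration.
snapshot = StatsSnapshot(active=120, completed=380)
print(total_urls(snapshot))  # 500
```

Note that "in process" URLs are a separate notion from active URLs, which is the confusion mentioned above; the sketch deliberately does not add an in-process count to the total.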

klockla closed this Nov 14, 2025
klockla deleted the crawl-stats branch January 8, 2026 08:37