[WIP] PoC for a Node.JS backend using nodegit#259
Conversation
| if (context.path === undefined) { | ||
| commits = allCommits; | ||
| } else { | ||
| /// CAUTION(this can be pretty long) |
There was a problem hiding this comment.
A simple way to speed this up is by not fetching the entire diff, but comparing the parents' trees to the commits' trees (without actually looking at the file contents). Looking at .coveragerc history takes around 30 s for me.
There was a problem hiding this comment.
See https://github.com/jonashaag/klaus/blob/purepy-hist/klaus/repo.py#L199 for an implementation
There was a problem hiding this comment.
(and 30s vs. how long with this current implem?)
Yes I think you make a good point. Will have to experiment and re-implement.
There was a problem hiding this comment.
Sorry, should've been more specific. I'm talking about huggingface/transformers/commits/master/.coveragerc. It takes around 30 s, vs time git log .coveragerc => 145 ms
There was a problem hiding this comment.
In Klaus I implemented it simply by calling git log using subprocess. I used the pure Python implementation linked above but it was very slow.
There was a problem hiding this comment.
Ok, will take a stab at this now (unless you want to do it? :) )
There was a problem hiding this comment.
No, go for it... I invested 20 min into coming up with a draft but realised it's going to take me much longer than someone proficient in TypeScript
There was a problem hiding this comment.
I also don't know if what I came up with in Klaus is the best way to do it (but it works)
There was a problem hiding this comment.
Interesting SO question here: https://stackoverflow.com/questions/21178383/how-is-gits-log-filename-implemented-internally
|
|
||
|
|
||
| app.get( '/:repo/commits', historyCommits); | ||
| app.get('/:namespace/:repo/commits', historyCommits); |
There was a problem hiding this comment.
This URL scheme doesn't allow for a repository called "commits" in a namespace, ie. somenamespace/commits. That's why I decided to use the ~ prefix in Klaus. Maybe there are other ambiguous names, I haven't checked thoroughly.
There was a problem hiding this comment.
Yes, we are kind of constrained in the URLs we have to support (as they already exist) so I'll just hardcode a blacklist of forbidden repo names
|
|
||
| /** | ||
| * Note: we only support branch names and tag names | ||
| * not containing a `/`. |
There was a problem hiding this comment.
I see where you're coming from but at least in the software engineering world feature branches (feature/xyz) are very common.
There was a problem hiding this comment.
Yes – for a v1 for the sake of simplicity I think this will work but we can always add support later
|
Reviewed most of the code and also did some performance tests on my machine. Implementation and performance look very good. libgit2 performance is 1–2 order of magnitudes better than Dulwich performance, apparently. I added a few minor review comments. I don't think there are any obviously bad implementation choices. I suggest that you performance monitor as much as possible in production: Some of the history-based views are unbounded (eg. counting number of commits) and may turn out to be too slow. It probably also makes sense to add caching here and there if you have an easy way to do cache invalidation (with the current refresh mechanism the page is essentially static so you could just cache everything). |
This reverts commit 2688f86.
No description provided.