[Azure Devops Repos] clone failure as packs use REF_DELTA objects: fix with git-style delta resolution #1025
Thanks for reporting. The effort to provide a reproducer is appreciated, even though I'd hope for a public repo URL that can be downloaded with it. Also, which version of gitoxide is this?
A `BufReader` is used internally, and 8192 is the default buffer size.
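For illustration, this is plain standard-library behaviour rather than anything gitoxide-specific, and it can be checked directly:

```rust
use std::io::BufReader;

fn main() {
    // The Rust standard library's BufReader defaults to an 8 KiB buffer,
    // which is where the suspicious "8192" in the progress output comes from.
    let data: &[u8] = b"example";
    let reader = BufReader::new(data);
    assert_eq!(reader.capacity(), 8192);

    // A larger buffer can be requested explicitly if desired.
    let big = BufReader::with_capacity(64 * 1024, data);
    assert_eq!(big.capacity(), 65536);
}
```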
This means that the remote sent a thin pack, which referenced a local object that is supposed to exist, for use as a base object. The only reason I can imagine this happening is if the negotiation step leaves the server incorrectly thinking that this object exists locally, but since this is a fresh clone, there are no local objects to negotiate with.
Contrary to my understanding, it is possible to have public repos on Azure DevOps with an MSDN subscription, so I created one for reproduction. @Byron As you correctly noted, the example code was not on gitoxide's main branch (it was release 0.51.0, if it matters). I updated my code to a git ref and re-ran it on the above repo. The read-pack stage now seems to run to completion. The hanging behaviour is a property of my own code. I'll be happy to dive deeper, but it may be some days before I have the time.
I added a trivial repository https://dev.azure.com/bittrance/public/_git/die-kleine that can be used to verify that very small repos can be cloned successfully.
That's perfect, thank you! I will take a closer look, but believe I have everything I need for a fix. If you are keen to investigate before me, you could do so - I have a couple of things to finish until the 25th (at most). Please note that I couldn't reproduce it with the SSH URL as it doesn't know me, but that won't be necessary, as you validated it's the same behaviour. That's great, as it means the problem is unrelated to the underlying transport.
Investigation

After a short moment of investigation it is clear that this is the code that cancels the operation: the pack, which is supposed to be complete, contains a ref-delta object, which means that in order to apply this delta, the named base object is needed. Since that object doesn't exist - after all, the local repo is still being populated - the lookup fails and everything unwinds. I think what's happening here is that the server is old enough to encode delta bases with a full hash instead of just the offset to a prior object. Maybe this is done because they don't sort their objects so that the base actually appears before the object referencing it, or maybe there is another reason.
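To make the failure mode concrete, here is a minimal sketch of the two delta-base kinds and why only the ref-delta path can fail during a fresh clone. All names (`ObjectId`, `resolve_base`, and so on) are hypothetical stand-ins, not gitoxide's actual API:

```rust
// Hypothetical types for illustration only - not gitoxide's real API.
struct ObjectId([u8; 20]);
struct Object(Vec<u8>);

enum DeltaBase {
    // OFS_DELTA: the base is identified by an offset to a prior entry in
    // the same pack, so it is always present once that entry is decoded.
    Offset(u64),
    // REF_DELTA: the base is identified by hash and may live outside the
    // pack ("thin pack"); during a fresh clone nothing exists locally yet.
    Reference(ObjectId),
}

fn resolve_base(
    base: DeltaBase,
    pack_entry_at: impl Fn(u64) -> Object,
    local_lookup: impl Fn(&ObjectId) -> Option<Object>,
) -> Result<Object, String> {
    match base {
        DeltaBase::Offset(offset) => Ok(pack_entry_at(offset)),
        // On a clone the local lookup must fail, and everything unwinds.
        DeltaBase::Reference(id) => {
            local_lookup(&id).ok_or_else(|| "base object not found locally".into())
        }
    }
}

fn main() {
    let outcome = resolve_base(
        DeltaBase::Reference(ObjectId([0; 20])),
        |_| Object(vec![]),
        |_| None, // the object database is empty while cloning
    );
    assert!(outcome.is_err());
}
```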
Diving deeper, it turns out that the other end is a V1 server, which responds exactly the same for both `git` and `gix`.
From there … and …
Note that both implementations ask for Version 2 of the protocol, but only get V1.
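Both protocol versions share the same pkt-line framing, which makes the captures easy to read side by side; a toy encoder, purely for illustration:

```rust
// pkt-line framing as used by the Git wire protocol: four hex digits giving
// the total length (the four length bytes included), then the payload.
fn pkt_line(payload: &str) -> String {
    format!("{:04x}{}", payload.len() + 4, payload)
}

fn main() {
    // "version 2" is 9 bytes, plus the 4-byte length prefix -> 0x000d.
    assert_eq!(pkt_line("version 2"), "000dversion 2");
    // "0000" is the flush-pkt that terminates a section of the request.
    println!("{}0000", pkt_line("command=fetch"));
}
```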
I may repeat the first sentence of the documentation of the `thin-pack` capability:
If that's to be believed, the server definitely sends an invalid pack, as the client never advertised any objects it already has. With a quick hack I removed the `thin-pack` capability from the request:
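The hack could look roughly like the following, assuming the capabilities are collected as strings before the request is written (the names and the capability list here are illustrative, not gitoxide's internals):

```rust
fn main() {
    // A hypothetical capability list as it might be sent to the server.
    let mut capabilities =
        vec!["multi_ack_detailed", "side-band-64k", "thin-pack", "ofs-delta"];

    // The quick hack: stop advertising "thin-pack", so that a conforming
    // server must send a self-contained pack.
    capabilities.retain(|cap| *cap != "thin-pack");

    assert!(!capabilities.contains(&"thin-pack"));
    println!("{}", capabilities.join(" "));
}
```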
(The request doesn't mention "thin-pack" anymore.) However, the server still answers with a thin pack, which is another reason to assume the other side is some custom, non-standard implementation. Next up was to see if the same pack is sent to `git`:
(one file received by each implementation). After removing the leader of the file, which contained a varying value - the number of milliseconds it took to ready the pack - I get this:
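For reference, such a comparison can be scripted along these lines; the file names and the assumption that the varying leader ends at the first newline are mine:

```rust
use std::fs;

// Read a captured response and strip everything up to and including the
// first newline - the varying leader with the milliseconds value.
fn body(path: &str) -> std::io::Result<Vec<u8>> {
    let bytes = fs::read(path)?;
    let start = bytes.iter().position(|&b| b == b'\n').map_or(0, |i| i + 1);
    Ok(bytes[start..].to_vec())
}

fn main() -> std::io::Result<()> {
    // Hypothetical capture file names.
    let same = body("pack-from-git.bin")? == body("pack-from-gix.bin")?;
    println!("identical after stripping the leader: {same}");
    Ok(())
}
```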
Indeed, the same pack is sent no matter what.

Further Study

Here is what I find amazing:
…just because it's utterly impossible for …

Reading further, I found the …

I even tried sending …

Looking into it further, I realized that …

Anyway, I think in …

Can …?
Well, that was disappointing, but not entirely surprising. I had the "Test with azdo" task high on my todo list because I know the azdo Git/SSH server is homegrown and quirky. I note that go-git also has problems supporting azdo: go-git/go-git#64 (or the archived src-d/go-git#1058). go-git does not support multi_ack, so it runs into a somewhat different issue. I respect the decision to kick this can down the road. AzDO is relatively common in the (MS-dominated) corporate world, so I'll leave it up to you to decide whether you want to keep this issue open for discoverability or close it. Thank you for your detailed problem analysis. ❤️
I think there is value in having a possibly user-selectable alternative implementation for pack resolution, and I think it could work very similarly to how `git` does it. Maybe one day Microsoft chooses to update their server side and there is no demand anymore, or they sponsor this feature - let's wait and see.
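For context, Git's own way of dealing with such packs (see `git index-pack --fix-thin`) is to resolve ref-delta bases against anything it can find: other entries in the same pack, regardless of their position, and then the local object database. A very rough sketch of that shape, with entirely hypothetical types (real resolution is recursive, since a delta's own id only becomes known once it is resolved):

```rust
use std::collections::HashMap;

type Oid = [u8; 20];

// Hypothetical pack entry: non-delta entries know their id, ref-deltas
// know the id of their base.
struct Entry {
    id: Option<Oid>,
    ref_base: Option<Oid>,
}

fn resolve(entries: &[Entry], local_object: impl Fn(&Oid) -> Option<Vec<u8>>) -> Result<(), String> {
    // First pass: index every decodable entry by hash, so a ref-delta can
    // find its base even if the base appears *after* it in the pack.
    let in_pack: HashMap<Oid, usize> = entries
        .iter()
        .enumerate()
        .filter_map(|(i, e)| e.id.map(|id| (id, i)))
        .collect();

    // Second pass: a ref-delta base may be in-pack or in the local store.
    for entry in entries {
        if let Some(base) = entry.ref_base {
            if !in_pack.contains_key(&base) && local_object(&base).is_none() {
                return Err("unresolvable delta base".into());
            }
        }
    }
    Ok(())
}

fn main() {
    let entries = vec![
        Entry { id: None, ref_base: Some([1; 20]) }, // delta precedes its base
        Entry { id: Some([1; 20]), ref_base: None },
    ];
    assert!(resolve(&entries, |_| None).is_ok());
}
```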
Hi, is it possible for this issue to affect some repositories on GitHub too, or should I open a new issue for this?
Thanks for letting me know, and for finding just the right issue to put it in! It's really good to know that this isn't exclusively an Azure DevOps problem. Maybe it was a good solution a couple of years ago, and Azure never moved on from it. That theory seems to be somewhat justified by seeing that the most recent commit in … To put it to the test, I forked the repo, reproduced the issue in the new fork, and made minor changes to the README. The issue still reproduces perfectly, probably because Git just does a pack-to-pack copy, so the objects stay exactly the same.
This issue showed up for me when implementing a toy git server with gitoxide's infrastructure. When pushing to a completely empty repo, my local `git` sends a pack containing ref-delta objects.
Thanks a lot for sharing, even though I definitely don't like what I hear. After all, this would mean that for some reason recent Git also does that, even though it shouldn't make any sense to do so - size-wise it's always better to use offset-deltas to refer to previous objects, so that ref-deltas are only used to refer to out-of-pack objects (which don't exist if the remote is empty).
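The size argument is easy to see from the entry headers: an offset-delta spends a variable-length integer on its base reference (often just one or two bytes), while a ref-delta always spends the full 20-byte SHA-1. A small sketch of the arithmetic, using Git's offset encoding:

```rust
// Length in bytes of Git's variable-length "offset" encoding used by
// OFS_DELTA entries: 7 payload bits per byte, with an off-by-one twist
// applied per continuation byte.
fn ofs_delta_base_len(mut distance: u64) -> usize {
    let mut len = 1;
    while distance >= 128 {
        distance = (distance >> 7) - 1;
        len += 1;
    }
    len
}

fn main() {
    const REF_DELTA_BASE_LEN: usize = 20; // always the raw SHA-1

    for distance in [100u64, 5_000, 1_000_000] {
        println!(
            "base {:>9} bytes back: OFS_DELTA uses {} byte(s), REF_DELTA uses {}",
            distance,
            ofs_delta_base_len(distance),
            REF_DELTA_BASE_LEN,
        );
    }
}
```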
Ran into this too. As I need to support many different git forges, I'll go with spawning a subprocess. Too bad, as I'd like to report progress at some point, and fetching is the only place where I'm still not using gitoxide :) On a positive note, all my other operations (tree traversal, commit traversal and getting blobs) seem to work about 3-10x faster than with libgit2, so I'm very happy with gix :)
Indeed, using Git for that is what many have to do; nothing (viable) quite reaches its portability just yet. With that said, happy to hear that `gix` is working so well for you otherwise.
Sorry for going off-topic. It could be that my usage of git2 wasn't optimal, or that the different gitoxide APIs allow me to skip some steps. If I have some time I'll create an example project with some benchmarks.
Current behavior 😯
It seems that Azure DevOps (at least over SSH) has some form of pagination when reading packs. Given the code below, fetching a trivial repo (one file, one commit) will work correctly and produce the following output:
When fed the URL of a non-trivial repo, the output will look like this. Note the suspicious 8192 in the read-pack progress. I've tested a few different repos and they all get 8192 "pack steps".
At this point it hangs. Eventually, the timeout occurs and the following error is raised:
Though I suppose this is just a consequence of us not having received the object in question.
I'll try to dig deeper tomorrow.
Expected behavior 🤔
If it works for small Azure DevOps repos, surely it should work for large ones :) The same code has no problem fetching a GitHub repo of non-trivial size.
Steps to reproduce 🕹
Code to reproduce the problem:
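A minimal clone in the spirit of the reproduction, sketched after the top-level `gix` crate's documented clone flow; exact signatures may differ between releases, so treat this as an outline rather than the original code:

```rust
use std::sync::atomic::AtomicBool;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The trivial public test repository mentioned above.
    let url = gix::url::parse("https://dev.azure.com/bittrance/public/_git/die-kleine".into())?;
    let interrupt = AtomicBool::new(false);

    // Fetch the pack and check out the main worktree; with a non-trivial
    // Azure DevOps repo this is where the REF_DELTA failure surfaces.
    let mut prepare = gix::prepare_clone(url, "die-kleine")?;
    let (mut checkout, _fetch_outcome) =
        prepare.fetch_then_checkout(gix::progress::Discard, &interrupt)?;
    let (_repo, _outcome) = checkout.main_worktree(gix::progress::Discard, &interrupt)?;
    Ok(())
}
```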