Add minimum version tracking to deleted crates #10016

Closed

LawnGnome
Contributor

As discussed in #9904 and #9912, this PR:

  1. Adds a column to the deleted_crates table that indicates the first version that can be used when publishing a new crate version with the same name.
  2. Populates that column when crates-admin delete-crate is run.
  3. Checks that column when publishing a new crate version.
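
As a rough illustration of step 3, the publish-time check could look something like the sketch below. This is not the code from the PR: `SimpleVersion`, `DeletedCrateRow`, and `check_publish` are hypothetical names, and the version type is a simplified triple that ignores pre-release and build metadata.

```rust
/// Simplified semver triple (assumption: pre-release and build metadata
/// are ignored in this sketch). The derived `Ord` compares major, then
/// minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SimpleVersion {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Hypothetical stand-in for a row of the `deleted_crates` table.
pub struct DeletedCrateRow {
    pub name: String,
    /// `None` models rows that have no minimum version recorded.
    pub min_version: Option<SimpleVersion>,
}

/// Returns `Ok(())` if `new_version` may be published, or an error message
/// if the crate name was deleted before and the version is below the
/// recorded minimum.
pub fn check_publish(
    deleted: Option<&DeletedCrateRow>,
    new_version: SimpleVersion,
) -> Result<(), String> {
    // No deletion record: nothing to enforce.
    let Some(row) = deleted else {
        return Ok(());
    };
    // Deletion record without a minimum version: nothing to enforce either.
    let Some(min) = row.min_version else {
        return Ok(());
    };

    if new_version < min {
        Err(format!(
            "crate `{}` was previously deleted; the first publishable version is {}.{}.{}",
            row.name, min.major, min.minor, min.patch
        ))
    } else {
        Ok(())
    }
}
```

With a recorded minimum of 2.0.0, this sketch would reject 1.5.0 and accept 2.0.0 and anything above it.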

Implementation-wise, most of the noise here is actually moving the deleted_crates publish checks into a sub-module, mostly because the original version I wrote inline in the controller was ugly as hell. Instead, we now get to enjoy a slightly overengineered state machine. I'm open to suggestions.

(Although, on the bright side, if we ever do need the more complex semver compatibility check rather than just checking against a minimum version, having the validation logic separated out already should make that easier to implement.)

I added new integration tests for the controller, but most of the specifics are tested in unit tests in said new sub-module. Redundant? Probably. 🤷

As for the deletion side of things, we're going to need the same min_version population behaviour (specifically the use of TopVersions) that I added to crates-admin when #9904 lands, but I was struggling a bit to figure out how to integrate that into models::krate, since crates-admin never actually constructs a models::krate::Crate when deleting a crate. I played around with doing a partial build of NewDeletedCrateBuilder based on a Crate, but the type signatures were driving me slightly bonkers, and it didn't really feel cleaner anyway, so I just did it inline for now.

Honestly, it might be simple enough that it's easiest to just copy/paste/adapt the same general logic into #9904.


I have a couple of other, more specific open questions that I'll throw into the inline review after I publish this, but I guess the main question here is whether this feels directionally correct.

This calculates the next semver-incompatible version of the crate being
deleted and inserts it as `deleted_crates.min_version`.
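
For readers unfamiliar with the term, "next semver-incompatible version" can be sketched as follows, assuming Cargo's default caret-style compatibility rules and assuming the highest published version of the crate is already known. The function name `next_incompatible` and the tuple representation are illustrative only and are not taken from the PR.

```rust
/// Given the highest published version as `(major, minor, patch)`, return
/// the smallest version that is semver-incompatible with it under
/// caret-style rules:
///   * `0.0.z`  -> `0.0.(z + 1)`
///   * `0.y.z`  -> `0.(y + 1).0`
///   * `x.y.z`  -> `(x + 1).0.0`
///
/// Assumption: pre-release versions are ignored, and overflow of `u64` is
/// not handled in this sketch.
pub fn next_incompatible(version: (u64, u64, u64)) -> (u64, u64, u64) {
    match version {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    }
}
```

Under these rules a crate whose highest version was 1.2.3 would get a minimum of 2.0.0, and one whose highest version was 0.2.3 would get 0.3.0.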
@LawnGnome LawnGnome added C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works A-backend ⚙️ labels Nov 20, 2024
@LawnGnome LawnGnome requested a review from Turbo87 November 20, 2024 09:20
@LawnGnome LawnGnome self-assigned this Nov 20, 2024
macro_rules! assert_result_failed {
    ($result:expr) => {{
        let text = assert_result_status!($result);
        assert_snapshot!(text);
Contributor Author

I decided to write the snapshots to files because long lines make my eye twitch, but I don't feel very strongly about this.

Comment on lines +127 to +131
        let response = $result.unwrap_err().response();
        assert_eq!(response.status(), StatusCode::UNPROCESSABLE_ENTITY);

        String::from_utf8(
            axum::body::to_bytes(response.into_body(), usize::MAX)
                .await
                .unwrap()
                .into(),
        )
        .unwrap()
    }};
Contributor Author

I thought about extending tests::util to be able to use the same response helpers on a bare axum::response::Response without the full boilerplate to fake a full request, but decided to keep this more self-contained for now. I can bring that code back if that sounds useful, though.

Comment on lines +110 to +105
"A crate with the name `{name}` was previously deleted.\n\n* {}",
messages.join("\n* "),
Contributor Author

This is basically the shitty version of our conversation about representing multiple errors in a single response in #9904. 💩
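
To make the shape of that combined error concrete, here is a small sketch built around the format string shown in the diff excerpt above. The wrapper function `deleted_crate_message` is hypothetical, and the individual messages passed to it are whatever the validation produced; none of their wording comes from the PR.

```rust
/// Joins several validation messages into a single response body: a header
/// line naming the crate, a blank line, then one `* ` bullet per message.
/// (Hypothetical helper; only the format string mirrors the excerpt.)
pub fn deleted_crate_message(name: &str, messages: &[String]) -> String {
    format!(
        "A crate with the name `{name}` was previously deleted.\n\n* {}",
        messages.join("\n* "),
    )
}
```

Because the first bullet marker lives in the format string and the rest come from the `join` separator, every message ends up on its own bulleted line.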

@Turbo87
Member

Turbo87 commented Nov 20, 2024

I guess now comes the question: is it worth it? 😂

one potential issue with this I discovered yesterday: when someone name-squats foo@100000000000000.0.0 and deletes the crate again. would that now cause others to only be able to publish foo@100000000000001.0.0 and upward from there? 🫣

as mentioned in #9912 (comment), I'm curious what the situation at other registries looks like regarding this situation, for user-issued deletions, but also when admins delete packages/versions for whatever reason.

If a crate has been deleted before, `deleted_crates.min_version` is
enforced when validating a crate version that is being published.

This also refactors the deleted crate check in general into a new module
to make it easier to unit test and reason about. Mostly the latter,
honestly.
@LawnGnome LawnGnome force-pushed the deleted-crates-min-version branch from 9c7f396 to edbf30a Compare November 20, 2024 09:29
@LawnGnome
Contributor Author

> as mentioned in #9912 (comment), I'm curious what the situation at other registries looks like regarding this situation, for user-issued deletions, but also when admins delete packages/versions for whatever reason.

All right, all right, I'll go actually ask some people.

*grumbles about this being as bad as having to phone someone on an actual phone*

@LawnGnome
Contributor Author

> one potential issue with this I discovered yesterday: when someone name-squats foo@100000000000000.0.0 and deletes the crate again. would that now cause others to only be able to publish foo@100000000000001.0.0 and upward from there? 🫣

So, yeah, that's an issue.

On the other hand, there's also kind of an issue here with the "specifically allow semver-incompatible versions" rule as well, which is that it's potentially hard to explain. If someone publishes, say: 0.1.0, 0.2.0, 0.4.0, 1.0.0, 3.0.0, then deletes the crate, then a new maintainer would only be able to publish 0.3.x, 0.5.x, 2.x.y, 4.x.y, and so on. Which is probably going to be challenging to communicate effectively in an error message.
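
The "only semver-incompatible versions" rule described above can be sketched by grouping versions into compatibility ranges and rejecting any candidate that lands in a range the deleted crate already used. This is an illustration under caret-style compatibility rules, not code from the PR; `CompatKey`, `compat_key`, and `may_publish` are made-up names, and pre-release versions are ignored.

```rust
use std::collections::HashSet;

/// Identifies the range of mutually compatible versions that a version
/// belongs to under caret-style rules.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum CompatKey {
    /// `x.*.*` for `x > 0`.
    Major(u64),
    /// `0.y.*` for `y > 0`.
    ZeroMinor(u64),
    /// `0.0.z` (every patch is its own range).
    ZeroZeroPatch(u64),
}

/// Maps a `(major, minor, patch)` triple to its compatibility range.
pub fn compat_key(version: (u64, u64, u64)) -> CompatKey {
    match version {
        (0, 0, patch) => CompatKey::ZeroZeroPatch(patch),
        (0, minor, _) => CompatKey::ZeroMinor(minor),
        (major, _, _) => CompatKey::Major(major),
    }
}

/// Returns `true` if `candidate` is semver-incompatible with every version
/// in `deleted`, i.e. it does not fall in any range the deleted crate used.
pub fn may_publish(deleted: &[(u64, u64, u64)], candidate: (u64, u64, u64)) -> bool {
    let taken: HashSet<CompatKey> = deleted.iter().copied().map(compat_key).collect();
    !taken.contains(&compat_key(candidate))
}
```

For the version history in the comment (0.1.0, 0.2.0, 0.4.0, 1.0.0, 3.0.0), this sketch accepts candidates such as 0.3.0, 0.5.1, 2.0.0, and 4.1.2, and rejects 0.2.5, 1.4.0, and 3.0.1, which matches the gaps described above and shows why the resulting error message would be hard to phrase.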

Of course, a crate having that set of versions in the first place might be a sign that the name is associated with someone with... unusual ideas on versioning. 😆

@bors
Contributor

bors commented Nov 22, 2024

☔ The latest upstream changes (presumably e04e4e2) made this pull request unmergeable. Please resolve the merge conflicts.

@Rustin170506
Member

> I guess now comes the question: is it worth it? 😂

Here are two very specific examples. I think if we allow version re-release, we will encounter these two issues. However, if we don't allow re-releasing the same version, perhaps the two issues will merge into one: it will just become "not found."

rust-lang/cargo#10071
rust-lang/cargo#10063

@Turbo87
Member

Turbo87 commented Nov 25, 2024

IMHO these seem like bugs on the cargo side, though admittedly they are caused by the assumption mismatch that versions/crates will just never ever disappear from crates.io, which for various reasons we cannot guarantee.

but again, with the set of requirements that we enforce for people to be able to delete crates themselves, it seems unlikely for people to regularly hit these sorts of issues. I'm not saying it will never happen, but I don't see it as being a very likely scenario for most people.

@Rustin170506
Member

> IMHO these seem like bugs on the cargo side

Yes, I'm not suggesting that crates.io completely fix these two issues. I'm just saying that, as the default registry for Cargo, we have an opportunity to reduce the likelihood of these situations occurring. Because, as you can see, fixing mismatches requires significant effort from Cargo (e.g., modifying the fingerprints). So I'm wondering if we could reduce the chances of this happening in the default registry, wouldn't that minimize the potential impact on users as much as possible?

@Turbo87
Member

Turbo87 commented Nov 25, 2024

> I'm wondering if we could reduce the chances of this happening in the default registry, wouldn't that minimize the potential impact on users as much as possible?

yeah, we could. but ultimately it's a tradeoff that needs to be balanced with all of the other arguments for and against this that were brought up in the comments of this PR (and the others).

I'm personally not convinced yet that it's worth the extra complexity for our (publishing) users, but ultimately the crates.io team needs to decide this.

@LawnGnome
Contributor Author

> I'm personally not convinced yet that it's worth the extra complexity for our (publishing) users, but ultimately the crates.io team needs to decide this.

I see this is already on the agenda for Friday, so I guess we can defer this until then.

@LawnGnome
Contributor Author

OK, so we discussed this at the team meeting, and I think I'm convinced enough that the restrictions on crate deletion in the RFC are sufficient to make the risk low enough that we can just scan for this ex post facto, thereby sidestepping awkward questions around how we communicate this to users.

Closing.

@Turbo87
Member

Turbo87 commented Dec 3, 2024

> we discussed this at the team meeting

to expand and summarize a bit more on this for those that were not involved in that meeting:

  • we acknowledge that there is generally a risk involved in allowing this
  • we understand that it would be unexpected to have certain version numbers blocked for people publishing a seemingly non-existent crate/version
  • we reduce the risk by allowing deletions only for crates that are not commonly used (the crate has been published for less than 72 hours, or it has a single owner, has been downloaded fewer than 100 times for each month it has been published, and is not depended upon by any other crate on crates.io)
  • corner case: a crate on crates.io is only popular within a company that uses a caching proxy for crates.io, so we don't see the high download numbers
  • the Rust Foundation security initiative will track republished versions outside of crates.io and issue warnings to the crates.io maintainers if potentially malicious activity is detected
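
The deletion conditions in the third bullet can be read as a simple predicate. The sketch below is one literal reading of that bullet, not the crates.io implementation: `CrateStats` and `may_self_delete` are hypothetical names, and the download rule is interpreted as "every individual month had fewer than 100 downloads", which is an assumption about wording that could also be read as a cumulative threshold.

```rust
/// Hypothetical summary of the facts the deletion rule depends on.
pub struct CrateStats {
    /// Hours elapsed since the crate was first published.
    pub hours_since_first_publish: u64,
    /// Number of owners of the crate.
    pub owner_count: usize,
    /// Download count for each month the crate has been published.
    pub monthly_downloads: Vec<u64>,
    /// Number of other crates on the registry that depend on this crate.
    pub reverse_dependency_count: usize,
}

/// Returns `true` if the owner may delete the crate themselves:
/// either it is younger than 72 hours, or it has a single owner, fewer
/// than 100 downloads in every month (assumed reading), and no dependents.
pub fn may_self_delete(stats: &CrateStats) -> bool {
    let recently_published = stats.hours_since_first_publish < 72;

    let rarely_used = stats.owner_count == 1
        && stats.monthly_downloads.iter().all(|&count| count < 100)
        && stats.reverse_dependency_count == 0;

    recently_published || rarely_used
}
```

The caching-proxy corner case from the list is exactly the situation where `monthly_downloads` under-reports real usage, so a predicate like this one would allow a deletion that affects more users than the numbers suggest.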

Due to the factors above, the crates.io team concluded that we will not implement a hard block on republishing previously existing versions of crates that were deleted, and will rely on the out-of-band detection by the Rust Foundation and other third parties instead.

Note that this is not a permanent decision, and if we see that these rules are insufficient we can adjust them accordingly.

Labels
A-backend ⚙️ C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works
4 participants