-
Notifications
You must be signed in to change notification settings - Fork 319
304 handling for fetch() #412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Currently Blink's implementation doesn't support 30x (it just fails) Was there any discussion about this? We believe this is a minor caveat that we could fix later. Meanwhile, affected developers should be able to avoid the issue (e.g. by avoiding any redirection). Let me know if this is narrow minded. As for the conditional headers, I'm not sure what would happen in Blink and if/how that would be wrong. Any extra details? Thanks. |
cc/ @annevk |
Note that 304 is a very special case among 30x. 30x is redirects and they should work. 304 is about caching and should also do something one way or another, it cannot just fail. Either it is passed through or you get a 200 out. Conditional headers are headers such as |
Talked to Horo-san, here is what Blink does: "if the content is in the browser cache and "Last-Modified:" is set, fetch() sends the request with "If-Modified-Since" header. |
@annevk I think this behavior is reasonable. We could expand later for more flexibility as needed (in which case we probably don't need the impact MVP label on this issue). Let me know if I'm missing something or if there are specific scenario that you are worrying about. |
Does Blink do the same with What if the web developer sets the Transforming 304 into 200 is precisely the kind of "magic" that we were hoping to avoid with |
@annevk re etag, I assume Blink does the same thing (will check with Horo-san and ask about the second question). I think the magic by default, crafty as an option approach is reasonable. It sounds that this issue might not be "impacts MVP" (feels like an enhancement). |
Horo-san checked fetch's behavior in Blink against the scenarios mentioned here and confirmed that it's exactly the same as XHR. |
So currently XMLHttpRequest requires that if the developer sets the conditional header (e.g. |
It seems logical that setting If-Modified-Since: or If-None-Match: explicitly triggers sending back the 304 at the API level, I can imagine SW wanting to refresh cached content that way, altering only caching metadata but not the content (http://tools.ietf.org/html/rfc7232#section-4.1 and http://tools.ietf.org/html/rfc7234#section-4.3.4) If not set, then it's up to the native browser cache to generate IMS/INM digest the (potential) 304 back and give back a 200 at the API level (where it could be detected as from the cache via an Age: header for example, if absolutely needed, see http://tools.ietf.org/html/rfc7234#section-5.1 ) |
Yes. I'm adding tests for this. |
I last gave my thoughts in #348 (comment). I think making caching work transparently (which I guess means translating 304s to 200s) is a good default, and a raw-mode later would be good too. I don't feel that strongly, but to me using the presence of If-Modified-Since or If-None-Match to switch to raw-mode seems bad. I'd rather that be done explicitly, so that there is only one flag (e.g. |
I agree with @domenic. @horo-t, if you have not implemented the XMLHttpRequest behavior yet, would you consider not doing so and instead adding an explicit flag to get back 304 always? By the way, if the server gave back a 304 and there's a cache miss, does the API get a 304 or a network error? It would be good if we finally nailed all of this down, and ideally in the same way for both XMLHttpRequest and fetch(). |
@annevk Isn't the default behavior (XHR) just fine? I'm not sure what you meant by "not doing so and instead [...]". We should do the raw-mode flag later as an enhancement. For the second question, I assume that this could happen if for some reason the cache got corrupted or modified/deleted. I think we should go all the way and give back the network error by default. The raw mode should give back the 304. |
@domenic, if you prefer an explicit switch, then you should have a failure mode when setting IMS/INM in "basic" mode, at least an indication that it was not set. |
@KenjiBaheux no, I don't think that if the developer set one of a magic set of headers, they suddenly get back a 304 whereas they normally do not. It's somewhat weird and magical. |
I've already implemented the same behavior of XMLHttpRequest in fetch API. I don't have strong opinions but I think that passing the 304 response to the caller of fetch API is reasonable. |
@horo-t okay. I guess I can live with properly defining that in Fetch plus defining a flag to turn off the automatic 304 -> 200 mapping. To be clear, we only disable the automatic 304 -> 200 mapping at the moment if there's a header set whose name is one of |
Here is how this should work.
The default is Note how the second case is different from XHR. |
The |
@mayhemer it's not entirely clear to me how to deal with partial content. Does us handling that mean we also pay special attention to a 206 response? What's the logic here, if we notice we have a partial cache entry, we set |
…partial content or an API
@mayhemer if I set my cache mode to |
I like the idea of I guess we wouldn't expose |
(btw the rest looks great) |
I guess it makes it easier to obtain what's already obtainable through a timing attack. |
If it gets a "NO" from security, it could be restricted to same-origin (or if the cached response has CORS headers) |
Not to bikeshed, but does "isFromCache" or "fromCache" refer to the HTTP cache? I assume it does. What about Response objects returned from the Cache API? I assume they shouldn't set this? |
@jakearchibald @slightlyoff @annevk Would this cache bypass mechanism address the duplicate data concerns with the Cache API? If an origin knows it will be storing large assets in the Cache API, then it could make the decision to bypass the http cache to avoid wasting resources. |
@wanderview : the duplication questions about Cache API aren't about the Cache API, they're about implmentation quality. The Cache API already makes it possible to entirely de-duplicate resources on-disk. If there's an issue, it's because an implementation isn't good (yet), not because the spec is any sort of problem child. |
@slightlyoff I agree its possible, buts its added complexity. I was merely suggesting that this proposed spec change might reduce the need to pay for that complexity. |
@wanderview : it's not added complexity for moderately advanced HTTP caches, and it's still not a spec issue. HTTP caches should already be de-duping based on hashes and only storing metadata independently. There are even filesystems and DBs which implement this, meaning that as long as the blobs are stored independently from the metadata, some impls might even get this for free. |
@slightlyoff spittake If only. De-dup was done for a brief time by Inktomi Traffic Server in the early '00s (according to their release notes and friends who worked on it), but that got ripped out sometime before it got Open Sourced as Apache Traffic Server. To my knowledge, none of the browser caches do it (happy to be proven wrong), and other proxy caches have looked at it to varying degrees, but either backed away slowly, or said "yeah, you should use a de-deplicating filesystem for that." Anyway. |
@annevk I like how that looks. Just to be annoying, I'll map from the proposal to HTTP request cache-control directives:
Now, I'll be the last to defend HTTP's choice of terms for these concepts, but is it really good to introduce yet another set of terms for the same things? |
"max-age=[some high number]" isn't really a term we could reuse, but agreed about the others. To be clear, this still needs security review as I'm not entirely convinced exposing this much of the cache is going to work. |
WFM, thx. "force cache" is almost something like "disregard freshness". If you can think of something similar that's zippier, it might be a better fit. |
bypass != no-store !!! "Cache-Control: no-store" means to cache, but not persistently. HTTP's "no-store" is actually "session only". "bypass" in this spec draft means "there is no cache at all = don't read from cache + don't write to cache" |
@mayhemer the description in https://httpwg.github.io/specs/rfc7234.html#cache-request-directive for no-store seems to disagree. |
@annevk no, it perfectly confirms what I say and reflects how we handle no-store in Gecko. ' "MUST NOT store" in this context means that the cache MUST NOT intentionally store the information in non-volatile storage ' It means, no persistent storage media can be used. But there is nothing said about non-volatile one. Non-volatile can be used. |
@mayhemer you're selectively reading the text. It very clearly says:
|
"store" - what is the meaning of this word exactly? |
BTW, I just repeat what Gecko does when Cache-control: no-store header is received on a response. The main point I originally wanted to make was that "bypass" doesn't mean "no-store" (which could be ambiguous with the cache-control header value.) |
If you go on to actually read the linked text, it says:
|
I opened three new issues for the remainder of this thread:
If there's anything I missed please let me know. |
Per #398 we need to decide how to handle 304 responses for
fetch()
and we need to decide if developers setting conditional headers has any "magic" effects. Implementations for XMLHttpRequest reportedly do different things, though I have not tested this recently. https://www.w3.org/Bugs/Public/show_bug.cgi?id=22414 has additional context.The text was updated successfully, but these errors were encountered: