Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "#342 #372 - Try to prevent caching of request" #436

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jwatzman
Copy link

This reverts commit a5e38c4.

It's not entirely clear looking at the history, but it seems this was added to fix a suspected caching problem in #342. The RSS feed in that task doesn't exist any more, but looking at the rest of the site at least as it exists today -- it's not a problem with Firefox/Livemarks, it's a problem with the site sending broken Cache-Control headers! They are saying their response can be cached for up to a year, which is obviously wrong for a news site.

But that is a problem with the site, not with Firefox/Livemarks. We really should not be ignoring the correct caching headers for every single other RSS feed on the internet to solve one broken site. In particular, since the default refresh interval for Livemarks is 5 minutes, this causes us to force-load all of the RSS feeds every 5 minutes, which tbh is a bit abusive.

This bad behaviour on the part of RSS readers has been called out a few times, most recently https://rachelbythebay.com/w/2023/01/18/http/ -- let's not be one of the bad ones, especially just to fix a single site whose RSS feed does not exist any more anyway.

This reverts commit a5e38c4.

It's not entirely clear looking at the history, but it seems this was
added to fix a suspected caching problem in nt1m#342. The RSS feed in that
task doesn't exist any more, but looking at the rest of the site at
least as it exists today -- it's not a problem with Firefox/Livemarks,
it's a problem with the site sending broken `Cache-Control` headers!
They are saying their response can be cached for up to a year, which is
obviously wrong for a news site.

But that is a problem with the site, not with Firefox/Livemarks. We
really should not be ignoring *the correct caching headers for every
single other RSS feed on the internet* to solve one broken site. In
particular, since the default refresh interval for Livemarks is 5
minutes, this causes us to force-load all of the RSS feeds every 5
minutes, which tbh is a bit abusive.

This bad behaviour on the part of RSS readers has been called out a few
times, most recently https://rachelbythebay.com/w/2023/01/18/http/ --
let's not be one of the bad ones, especially just to fix a single site
whose RSS feed does not exist any more anyway.
@evilpie
Copy link
Collaborator

evilpie commented Jan 28, 2023

I don't think we can revert this without regressing other feeds like #256 or #353. We probably have more reports that I haven't been able to find quickly.

But you are also right that updating every 5 minutes is excessive, especially when multiple people are updating the same feed. We should maybe increase the default update interval to something like 15 minutes.

@jwatzman
Copy link
Author

Do we have an understanding of what the actual underlying issue is? (I don't see real explanation on those two issues, beyond "we've disabled caching".) It's not obvious to me there was any underlying issue, FWIW -- looking through some of those sites as of today:

  • https://www.flyertalk.com/forum/external.php?type=RSS2&forumids=446 -- sets cache-control: max-age=1674919524 and also expires: Sat, 28 Jan 2023 15:25:24 GMT (which is about 3 minutes from when I sent the request) -- per RFC7234 section 5.3, max-age takes precedence -- so the server is telling us to cache for 50+ years which is broken on their end.
  • https://www.gunnerkrigg.com/rss.xml -- sets both expires and cache-control: max-age to about two days. Looking at their posting frequency, that seems about right. But it could mean it could take a little bit of time for a post to show up, but that's what they asked us to do in their headers.
  • https://3downnation.com/feed/ -- sets pretty aggressive flags that say not to cache. (It's possible this used to be broken on their end and is now fixed.)

Do we think Livemarks or Firefox is/was actually ignoring or misinterpreting the caching headers, or are we just trying to work around broken sites like flyertalk? (And perhaps impatient people for sites like gunnerkrigg.) If there is actually an underlying caching bug, have we reported it to Firefox's bugzilla? (I'd be surprised at such a thing, but it's not impossible!)

The spec has this sort of caching for a reason, to prevent sites/RSS feeds from getting hammered. A force-refresh even every 15 minutes is super excessive for a site like gunnerkrigg who updates only every 2-3 days, or like rachelbythebay which updates far less often. The correct interval of time for a refresh is what the site tells us that it is, via caching headers (and via things like etag or modified-since etc which can make updates appear quickly without force-refreshes, this is all stuff Firefox supports well). It seems in quite poor form to ignore polite requests to "please not hammer me" from every site on the internet because one or two like flyertalk is incorrectly configured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants