-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce unvisitableExtensions
to remove isHTML
implementation
#1230
Conversation
I'm wary of using an exclusion list and assuming that most things are HTML by default. I think the list will just keep growing. But I do like the API you've proposed better than Turbo.htmlExtensions.add(".foobar") The implementation would be as you've proposed, but would check for inclusion in the HTML extensions, rather than exclusion. We'd ship with the defaults already defined by the regex: export const htmlExtensions = new Set([ ".html", ".htm", ".xhtml", ".php" ])
export function locationIsVisitable(location, rootLocation) {
return isPrefixedBy(location, rootLocation) && htmlExtensions.has(getExtension(location))
} EDIT: I guess that won't work when there's no extension – the implementation would need to account for that. |
On the one hand finding the extension is always a educated guess. In our application we have URL's with a filter, containing a date range presentation. Example: On the other hand, we need to think about fallback situations. If the So, I'm still in favor of the proposed set for non-HTML extensions. I'm aware it is not covering all possible extensions, but when it does not cover an extension, the user won't notice and the experience stays the same. I like the |
Thanks for the explanation. I hadn't considered dynamic paths like the one you described. I think your reasoning makes sense. I think we'd need a much larger default exclusion list, however. Just looking at Basecamp and HEY, we have a pretty extensive list of extensions we'll auto-link to: zip, tar, gz, bz2, rar, 7z, dmg, exe, msi, pkg, deb, iso, bmp, mp4, mov, avi, mkv, wmv, heic, heif, mp3, wav, ogg, aac, wma, webm, ogv, mpg, mpeg. While these could return HTML, it's extremely unlikely, and we wouldn't want to start making two requests to resolve them. |
Thinking about it more, I think there are more downsides to an exclusion list than the inclusion regex we have now. It makes more sense to define the extensions that we're confident are HTML than it does to define the ones that aren't. The former is going to be a much smaller set than the latter for all practical purposes. The example with a pathname like |
For the inclusion list/regex to work, it should have perfect coverage. My situation is very specific, but there have been plenty of other examples like FQDN like example.com, Rails ids in paths, .deploy, file extensions, and usernames with a dot in it. I don't see how a default list/regex is going to cover all these situations. If the list/regex is a configuration, a user could extend the list and cover their needs. But even then it is hard to get it covered. Many paths contain dynamic parts that are hard to contain in a list or regex. But I'm also thinking about the developer experience. A new developer starting with Turbo expects Turbo to work all the time. When a part of the requests/refreshes are suddenly page reloads, it is not clear at first hand what is happening. This was my firsthand experience last few days when users complained about random refreshes and we couldn't find the cause because the I don't see a downside for a long(er) exclusion list. We probably can find a reasonable list of say 100 most used file extensions on the web that give no HTML response. If we miss something, the end-user won't notice other than the double request. And the developer can safely extend the list by providing a specific extension. What I was thinking about, is the introduction of the (first) Turbo configuration. Instead of going with the |
Is it too complicated to add a slash at the end? In Rails you can easily do it with the |
Yes, this has all kinds of side effect in our application. Furthermore I believe a library like Turbo shouldn't dictate if slashes are added or not. Rails might have an easy solution for this, but isn't this library intended to be used by other frameworks too? |
How are we going to move forward with this PR? I haven't read any arguments that have convinced me the exclusion list is not going to work, other than some minor inconveniences with maintaining the list. In my point of view this downside does not outweigh the downside of unexpected behavior with full page refreshes due to some random |
Given the original issue, and other solutions that have been proposed (e.g. #519), what are the requirements for a solution to fix this two+ year old issue? It seems to me that we shouldn’t let perfect be the enemy of good: yes, maintaining an exclusion list sucks, and maybe the list in this PR doesn’t cover every case. But there’s currently no clean way to implement what the PR is fixing.
This was the last message on the topic—is your belief the same, @packagethief ? Because it sounds like the API/regex is the issue, not the overall idea. And if that is the issue, then I think that API can be improved in the future; but if you disagree, then maybe this PR can be amended to introduced functionality to make this configuration easier. |
I agree with that. I've come around on the idea of an exclusion list. As long as the default is reasonably comprehensive (there's a longer list in #1230 (comment)) and there's an official way to configure it (which this PR includes), it should be less onerous than having to define your own permitted extensions. If I may though, |
Sorry for the direct ping @edwinv — just wondering if you had time to tidy up the work here! |
@gjtorikian I think the PR is finished, now we have agreed on the exclusion list. I've renamed the variable (thanks for the suggestion @packagethief) and included some more extensions to the list to cover more cases. Ready for a final review! |
treatAsNonHtml
to remove isHTML
implementationunvisitableExtensions
to remove isHTML
implementation
CI fails on a test that does not fail locally with me, was not failing on previous commits and is unrelated to my changes. It seems like a random failure, but I can't retry it to verify this. |
Yeah, a quick search shows that the same test failed last month: #1169 (comment) I reckon it’s flakey. |
With the cleanup made and the flakey test flaking, is this ready to be merged/released? |
Yes! Sorry for losing track. Looks like there's a small conflict to resolve. |
The conflict is caused by #1217, which means that this should be moved to |
The conflict has been resolved in line with the #1217 changes. |
I've identified an issue with the tests in #1217 due to the complete config being On a side note: it took quite some time to find the issue due to Playwright not failing in JS error that occur in the browser. I'm not very familiar with Playwright, but wouldn't it be better to catch all errors and console logs and raise them in the tests? Otherwise some exceptions might not be noticed because we just ignore them. |
All checks are green, ready to merge I guess? I prefer to be ahead of potential new conflicts due to other changes being merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @edwinv. Just a couple of style tweaks and I think we're ready to merge. Thanks for sticking with it 🙏
Co-authored-by: Jeffrey Hardy <[email protected]>
Fixes #608
When Turbo navigates to a new URL,
locationIsVisitable
is called to check if the location is visitable. One part of the check is checking if the requested URL returns HTML. We can never determine with 100% confidence if a request will return HTML. A page with a.html
extension might still return anothercontent-type
. A page with a.foobar
extension might return HTML. Using a check on the file extension is therefore imprecise and prone to bugs.Turbo already handles non-HTML responses. When a response doesn't contain an HTML part, a
contentTypeMismatch
is raised. This causes the page to reload to the requested URL. So we could remove theisHTML
check completely and still have a functioning situation. The major downside is having two requests: one Turbo fetch and one regular browser request.A user could prevent these double requests by adding
data-turbo="false"
to these links. But there might be some sensible default by never handling known non-HTML extensions as a Turbo request. This PR introduces theTurbo. unvisitableExtensions
setting to enable this.By default, a range of common non-HTML extensions are included in this setting, like .png and .jpg. If users have many links to other extensions and don't want to sprinkle their HTML with
data-turbo="false"
attribute, they can add new extensions to the set:Because these changes are added to the lower-level checks, they work both for regulars visits and the newly introduced page refresh morphs.
Mentioning @packagethief @dhh @jorgemanrubia for some visibility.