-
Notifications
You must be signed in to change notification settings - Fork 487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option for passing remote url encoded? #250
Comments
Yeah, I'm totally supportive of adding that. I suspect the performance hit would be negligible, and we might be able to come up with some clever shortcuts to keep the common case faster (e.g. check if raw string before decoding starts with "http", in which case try to use it as-is, etc) |
Well if the param isn't encoded, we shouldn't decode it as that may decode encoded params within the url and break things. Pseudocode...
My opinion is that for these special-case url decoding algorithms where we expect a parameter to contain a URL, it is best to check for It is easy to support urls that have been double/triple/etc encoded as well by iterating the decode (up to a limit) and checking to see if we are done before each iteration (we don't want to over-decode because we may decode encoded parameters in the url that will then break the URL). Checking for just My go isn't great but I can look into creating a pull request for this at some point if that is interesting for you. |
Howdy. I started looking at what would be involved to make this change. I was wondering if you could clarify which direction you want to go with
However the relative URL case doesn't appear to be covered in Line 114 in 52f4360
The comment on Line 323 in 7e21abe
...but the code currently implements the relative url case Line 355 in 7e21abe
The only reason this is a problem is because of the following:
If options weren't optional (maybe For instance the comment on
Both of those omit the options however the first one could in some cases be intended as a relative url...ie I might have set a I could definitely imagine a case where you want to run all your upstream urls through an additional imageproxy that does some other type of transform (e.g. background replacement) so you might pass absolute URLs to imageproxy but also set This Line 360 in 7e21abe
is confusing as well because the user doesn't need to provide an absolute remote url... baseURL.ResolveReference(req.URL) just needs to be absolute.
Lots of ways to fix that ambiguity (use different parameter delimiters, require options, etc)...but all of the changes would likely break someone's in-use pattern. I would argue that if imageproxy wants to accept relative url paths (e.g. So it would be great to get your thoughts on this. On one hand don't want to break anyone that upgrades imageproxy's currently functioning url patterns. On the other hand it looks like adding in support for encoded url parameters may add even more edge cases that don't behave as people expect. It would be nice not to have to rely on trying to decode the remains of the path as a URL to detect whether we are looking at the url parameter yet or not. First way to solve thisJust stick with the existing logic which depends on whether or not Second way to reason this out would be the following:
The impact of this approach is that it requires the slashes to be there: A third waywould be to check if
Fourth way...break current apisI'm not particularly happy with either of the above options as it seems like they are maybe going in the wrong direction in terms of making behaviour more predictable and robust. Personally I would prefer less flexibility in how the user's build imageproxy requests (e.g. excluding options) in favor of more predictable behaviour. The "try to url decode" and if it errors indicates options are present...doesn't really give me a good feeling as the common case (options) has an extra url decode running. Granted this probably takes no time relative to fetching an image from disk (let alone remote server)...still. One way would be to no longer allow options to be optional (e.g. always have pattern Another would be to change the delimiter to make it more obvious where parameter boundaries start/end. Doubleclick/Google uses Any of these options are likely to break someone who upgrades imageproxy without updating their generated URL formats. Sorry for writing so much. What do you think about how to approach this? |
Ability to pass remote url encoded (or not). Option 4 implementation. Requires `{options}` to be present in request URL. `x` or `0x0` can be used as no-op options. The ability to exclude options was not documented in README anyway, but nevertheless a breaking change.
Hi @willnorris, happy new year! As I didn't get any feedback from you in terms of preference on how to approach this I went ahead and did an implementation for option 4. Happy for feedback if you want to go another way. I'm not a golang guy so I happy for feedback on that front as well if I missed something. I didn't bother to open a pull request because I'm not sure what your feeling is on the various approaches outlined above. |
Ability to pass remote url encoded (or not). Option 4 implementation. Requires `{options}` to be present in request URL. `x` or `0x0` can be used as no-op options. The ability to exclude options was not documented in README anyway, but nevertheless a breaking change.
Ability to pass remote url encoded (or not). Option 4 implementation. Requires `{options}` to be present in request URL. `x` or `0x0` can be used as no-op options. The ability to exclude options was not documented in README anyway, but nevertheless a breaking change.
Another benefit to supporting url encoding.... The README says:
This is due to how webservers/browsers deal with caching when query string is present. With encoded remote URL support:
...then the end-user browser and webserver can optimize caching between them more easily because there will be no query string present on the imageproxy requests (encoded as part of remote url). So if we can support URL Encoding, then we can update the README to just advise users to url encode remote urls that contain query strings to optimize caching (rather than avoiding them altogether). |
Hey @willnorris let me know if I can do anything to make this easier for you time-wise. I know I wrote a lot of ideas for discussion earlier. As I have implemented a proposed change, you can ignore the earlier thoughts. The TLDR of the change I made is:
Before this change the code would attempt to URL-parse the first param and if that failed then assumed there were options present. That meant in the common case (options and url present) both the options and url would be attempted to be URL parsed in the logic. Now we expect parseable options to always be the first parameter. The fact options were optional was not documented in the README, only in the code...so hoping this isn't a big breaking change for most users. The fix is just to add an empty/no-op option ( If you are open to this approach, let me know and I will open a pull request. Thanks! |
I went ahead an opened a PR for this (maybe that is better for you than looking at my fork)...and squeaky wheel gets the grease as they say. ;) |
- URL encoding of remote urls is now supported (but still optional). - Options parameter is no longer optional. - You can pass empty/no-op options (`//http...` or `/x/http...` or `/0x0/http...`), - but not `/http...`. - Added more test cases.
Hey, I know the docs say it isn't supported (https://github.com/willnorris/imageproxy#remote-url) and advise against using query strings in upstream image source urls.
The simple/clean thing would be to URL encode the upstream/source image url before appending it to our imageproxy url. This isn't supported today unfortunately. Perhaps there is a good reason?
This could be done either with a config switch ("always decode 1x") or by having imageproxy just check the url parameter and decode up to 2 times or until substring
http://
orhttps://
is found at index 0 of the url param. The advantage of the config switch is not changing current behaviour for existing users and potentially letting users avoid the decoding performance penalty if they don't need it. The non-config/implicit behaviour is interesting because it shouldn't decode unless it doesn't findhttp://
orhttps://
at the beginning of the remote url param...in which case I guess imageproxy would error anyway.Not sure if there are compelling reasons why passing the upstream url encoded isn't supported (since this is documented as the way it works), but hopefully you will consider the change.
Meanwhile I'm trying to just do the url decoding on the nginx side before passing to imageproxy...however this doesn't appear trivial because decoded nginx vars (like
$uri$is_args$args
are "normalized" which isn't just url decoding...also things like condensing two slashes to one (sohttps://...
becomeshttps:/...
in the$uri
var, even when passed to nginx encoded in the first place...and of course imageproxy doesn't like that either (invalid request URL: malformed URL
)The text was updated successfully, but these errors were encountered: