-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Range requests #887
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach looks okay as a starting point. I think it would be nicer if this could be handled transparently in Rocket regardless of the Responder
used and/or provided as another Responder
implementation, but that might be impractical since it requires Seek
.
@SergioBenitez worked a bit on range requests a while back but never completed it; it might be good to compare this with some of the work done if I can track it down.
(Thanks for the reminder on IRC that this hadn't been looked at yet)
core/lib/src/response/responder.rs
Outdated
); | ||
} | ||
let size = body.seek(io::SeekFrom::End(0)) | ||
.expect("Attempted to retrieve size by seeking, but failed."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm uncomfortable with this expect
. I think it's fine for Vec<u8>
and [u8]
for example, but I'm not so sure about File
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you suggest to "fix" this?
To me this in an "Internal Server Error", because the state of the seek pointer is unknown at this point, which means we cannot fallback anymore, because this will also have an impact on Read
.
Rocket would catch this (unlikely) panic anyway, but I guess I could create a 500-response instead and return that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I prefer to return some kind of Error
directly rather than panicking. Although looking at it, Response::set_sized_body
itself panics instead of returning Result
.
I looked at a few of the platform implementations of Seek
for File
. It looks like *nix and Windows only panic when the offset is out of range for an i64
, or for any platform-specific reasons (e.g. the File
object is in reality a handle to a pipe or socket, not a file).
My original thinking was that in the case of a "seek to get size" error we could just seek back to the start, ignoring the range request, and/or return a range not satisfiable. But I'm not sure if that's really worth the complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If one seek already has failed, it is very likely, that another to the beginning will fail as well.
The current approach is my favorite, but I if you would like to see something else, just tell me.
@jebrosen Thanks for taking a quick look at it. I tried to address most of your concerns. Another Responder similar to how Providing this functionality for all bodies on the other side might not be very helpful, but would introduce a good amount of additional complexity. (E.g. I cannot think of a good reason to request a certain byte-range of a json record. Most of the time this should be solved at the user-level.) |
2069e9f
to
f76719b
Compare
This is starting to collect some dust, but I would still like to see this merged someday in any shape or form. Could I get another round of feedback please? Ping @jebrosen or @SergioBenitez |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still love to provide a way for downstream crates to leverage this, but I'm not sure where exactly to put that. I think it would be possible to do struct RangeResponder<B: Read + Seek>(B); impl<'r, B: Read + Seek> Responder for RangeResponder
, and that might also reduce some of the awkwardness I mention below.
core/lib/src/response/responder.rs
Outdated
@@ -217,34 +219,142 @@ impl<'r> Responder<'r> for String { | |||
} | |||
} | |||
|
|||
fn try_range_response<'r, B: io::Read + io::Seek + 'r>(req: Request<'r>, mut body: B) -> Result<response::Response<'r>, B> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This signature does feel really awkward, but that's mostly because of the Seek
bound, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really because of Seek (this implementation panics, if seeking fails), but to provide a way to recover of other errors. I could move those checks into respond_to
instead. That would likely fix the awkwardness, but result in some code-duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have explained that more clearly: I found it awkward because this isn't something like Response
-> Response
, but instead this function requires the caller to move the body in and out. That's why I was hoping a Responder
bounded on Seek
would make that go away.
core/lib/src/response/responder.rs
Outdated
fn respond_to(self, req: &Request) -> response::Result<'r> { | ||
let mut body = Cursor::new(self); | ||
|
||
match try_range_response(req, body) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of the respond_to
methods are harder to read than without Range
support, mostly because of that awkward function signature. I might play around with this and try to make things a bit nicer.
While this likely would look a lot cleaner, it would duplicate a lot of code and I would like to make this transparent. There is no reason not to provide range-support for types implementing Seek, because this depends on client support anyway and is 100% backwards compatible. |
My idea was to replace the entire Response::build_from(RangeResponse(self))
.header(ContentType::Binary)
.ok() That way Aside from code, we will also need:
|
Alright, I have made some improvements:
But I have also introduced a new issue in the process. The file handler is now much simpler, but not equivalent. It previously used I do not know, why the old code, did not use Seek as well, I assume that this was some kind of optimization? I also assume, that it is an oversight in that case, that I am not sure, there is a good way to restore the old implementation when implementing range responses like this. At least no good idea jumps to my mind. But we definitely want range responses for files by default in my opinion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your patience with this. I think I'm pretty excited about this direction after seeing it.
The second new commit adds the RangeResponder as a public struct as you were suggesting. I am not totally sure, that this is a good idea. People might get the impression that File/&[u8]/Vec do not support Range requests by default. But I think this is solvable with good docs
This is my biggest hesitation to making it public -- documentation. We should definitely document that File
etc. support range requests now. If we make RangeResponder
public, it will also need its own documentation and tests and maybe even a different name.
But I have also introduced a new issue in the process. The file handler is now much simpler, but not equivalent. It previously used metadata to construct a Sized body manually and otherwise construct a Streaming body. The new implementation does always use Seek to determine the size (because it just uses sized_body, if no Range is requested).
I do not know, why the old code, did not use Seek as well, I assume that this was some kind of optimization? I also assume, that it is an oversight in that case, that sized_body of the ResponseBuilder uses a File in the documentation example? Because otherwise, the docs would encourage a different implementation than what Rocket does by default, which seems a little paradox?
I am not sure, there is a good way to restore the old implementation when implementing range responses like this. At least no good idea jumps to my mind. But we definitely want range responses for files by default in my opinion.
The short answer to all of this is: it's likely platform dependent, and you can probably use File
to serve things that aren't really files. But I am a bit suspicious of any file-like thing where metadata
outright fails but read
ing the file succeeds. I think it deserves investigation, and maybe we should continue to use Stream
(without Range
support) in cases where the metadata
check fails.
Side note: hyper 0.12 removes typed headers, so in that porting process we will probably have to revive the header parsing code from the last time Sergio worked on this in the range
branch.
Note that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another few math errors (thanks go to foba in IRC/Matrix who helped discover these).
fc289a3
to
18fc7e0
Compare
Awesome! I will try this out. In addition to
Just something to be handled during the hyper upgrade. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played with this for a while and finally realized the byte ranges are supposed to be inclusive. It's simpler to keep end
as an exclusive position within the code because it corresponds directly to size
, but we need to transform it on the way in and out.
Should all be fixed. @jebrosen Thanks for all the testing and debugging (also @ foba). My use-case for this PR is long gone, I hope this gets merged soon. 😄 |
Forgot to fix up the tests. Hopefully, travis is now happy again. |
@jebrosen Ping. Just in case you might have forgotten this PR. No rush. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hooray, it passes the "10 seconds test" without modification!
My biggest concerns are mostly procedural:
- Whether to apply this automatically to
&[u8]
andVec<u8>
- my inclination is no onVec<u8>
, but&[u8]
doesn't bother me as much because it could only be borrowed from global state or the request metadata. - Naming and placing
RangeResponder
But I also had a troubling thought earlier today regarding chaining Responders
. As an example, the ETag
header needs to be calculated and applied based on the entire representation, and If-None-Match
should be evaluated and possibly resolved as a 304 before a range response is even considered. If a responder or fairing processes Range
before ETag
or If-None-Match
handling, the result will be wrong. Content-Encoding
is a similar situation.
I think the only correct place for range request processing is as the last ("outermost") responder that runs, and this PR as written would break both Compress<File>
, available in rocket_contrib
, and ETag<File>
, which isn't in rocket or rocket_contrib
but implementations exist. I just wish I had realized this problem sooner :/
hyp_res.headers_mut().set(header::ContentLength(size)); | ||
let mut stream = hyp_res.start()?; | ||
io::copy(body, &mut stream)?; | ||
io::copy(&mut body.take(size), &mut stream)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergioBenitez do you know why an equivalent of this wasn't being done already? i.e. why is it allowed to pass (via set_raw_body
) a body longer than the size
was specified to be?
&[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] | ||
} | ||
|
||
mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some other interesting test case ideas: apple-banana
, 3-20
, -1-3
, 9-9
, -0
. A lot of the code here can probably be deduplicated with helper functions like assert_range_succeeds(range_spec, expected_range, expected_bytes)
and assert_range_fails(range_spec)
.
Okay, this sucks (but it is probably better to think about it now, rather than after the fact). Or is there a way to enforce the last constraint? Maybe something like a new trait This sounds really cumbersome. |
The problem is precisely with having The obvious solutions to my mind are a) making it strictly opt in or b) apply it automatically to Option c) is to do this in rocket itself automatically, but that will be a lot more work, possibly changing Option d) is something like |
I think we are not doing anyone a favor with Option b. I can either quickly change this PR to do Option a and then tool around with d (if rustc is able to infer a lot of this mess, it might be reasonable to use and just a little painful to implement) or just go for c, but I have no idea about that async-stuff. Do you have a preference for either of those? I think both have their cons and pros, but I would definitely try to tackle them. Time is not an issue, I rather do this right then to get this merged quickly. |
For now I think option a is the best choice, sadly. Unless a design for option c is relatively trivial (and I think it wouldn't be), I would like to avoid the headache of rewriting it again so soon for async. The conundrum of order-sensitive |
@jebrosen Bumping this because I'm very interested and activity on this PR has died down. (apologies if you had not forgotten about it) |
So I did write a blog post with a sad conclusion, but never really publicized or linked it. I still think "Option a", the opt-in wrapping Responder along with clear documentation explaining potential pitfalls, is the right way to go. So far having range requests be done automatically for all responses (or even just |
Oh, thank you for linking this - I had missed it in my searches. As an end user, I would be very content with option A. It would match exactly what I've been doing except without having to maintain an extra 160 lines of code for the one endpoint on which I want to support range requests. |
74113c3
to
95c981d
Compare
Just a quick note, because I have stumbled on this (rather dead) PR again: I am happy to answer questions about the current implementation, just ping me. But other then that, I will not further participate. I encourage anyone reading through the pull request to continue, were we left of. All the work and discussion is public and collected at this place. I hope this feature will be finished some time in the future. Good luck to everyone involved. :) |
Hi! I need to support a Range request in my application, so I'm basically trying to do the math on whether a cobbled implementation for one route in my application is the right move, or whether I could carry this PR across the line. I've just read the discussion above (And thankyou @Drakulix for most of an implementation) and my understanding is:
So to get this across the line I believe the actions are:
Does that sound accurate @jebrosen ? |
This is a rebase by richo of PR rwf2#887 as a starting point toward mergability.
That sounds about right. Automatic compression via
I'm in favor of this second option now, but I'd love to hear variations on these or other ideas. |
Ok that's a little different to what I was thinking of, I was planning to
add a Range wrapping responder for Seek + Read. It sounds like you're more
interested in special casing File with an extra constructor?
For once I am fairly unopinioated, I just want to make sure I work on
something you're interested in merging :)
…On Sat, Nov 14, 2020 at 09:52 Jeb Rosen ***@***.***> wrote:
That sounds about right. Automatic compression via Content-Encoding and
automatic ETag are two examples of relatively commonly used or requested
middleware that are straightforward to apply to all responses, yet would do
the wrong thing if the response was a partial range. The main options I see
here are:
- Use automatic Range handling by default on File et al. Fairings or
Responder types that inspect or change the existing response body have to
be aware of Content-Range and skip processing those responses (losing
out on the functionality they provide).
- Make automatic Range handling opt-in via APIs such as RangeResponder,
NamedFile::with_range_handling(), etc., and describe the caveats in
the documentation.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#887 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AADUKAU5Z4NEFAVZZACVPSDSP27WJANCNFSM4GPZI7QA>
.
|
Ah, sorry - either way, I would like to keep a wrapping |
Gotcha, I am with you now! I will try to get a PR up for a wrapping responder, once that makes it in maybe we can discuss how to make the ergnomics better? |
I know this has been dead for four years but I would really like this feature – or at least some guidance on how to implement Range requests – for my own Rocket project. |
This is an attempt to fix #806 (at least partially).
This transparently handles
Range
requests for&[u8]
,Vec<u8>
,File
andNamedFile
, when the total size is known, allowing partial downloads.It also advertises this feature for the listed types by adding an
AcceptRanges
header.The implementation follows the MDN examples and the corresponding RFC, but only supports
Single part ranges
, which is compliant with the RFC, as a server may choose to ignore or reject any range request.Multipart ranges
would most likely be better implemented as a customBody
variant and would be more complicated. I have no use case for those, which is why I did not bother, but if that would be blocking and someone can provide some directions/suggestion on how to implement those (I am not very familiar with rocket's codebase), I would try to tackle that as well. But I would strongly prefer to do that in a separate PR.I am looking for some initial feedback on the approach. I did implement it in the corresponding
Responder
implementations to have access to theRange
header of the request and the response as well. It seemed fitting. But it would probably also be possible to just parse the header in theResponder
and just store that in theResponse
orBody
and deal with it later in the chain (when writing out theResponse
for example).Docs and Tests are completely lacking, also formatting was not working well,
cargo fmt
changed the whole codebase. So I would appreciate some more comments on what I should provide on those subjects as well.Thanks for your time :)