Setting a custom capture framerate #11

Open · Boscop opened this issue Sep 6, 2017 · 25 comments

Boscop commented Sep 6, 2017

I bought a webcam that can do 30 fps at Full HD. I tested it in OBS Studio; at first I got a very low capture framerate, about 5 fps. But then I read this post and it worked: by setting the "Resolution/FPS Type" to Custom and the Resolution to 1920x1080, I got the 30 fps capture framerate.

Then I tried to use the webcam in my OpenGL application, capturing it with Escapi and rendering it to the screen.
But in my application I ALSO only get that low capture framerate with Escapi :(
(Btw, I get high FPS with a friend's webcam, so it's not my application's fault.)
I talked on IRC with the developer of OBS about what OBS does when you set a custom capture FPS: it uses DirectShow functions to change the pin type. He pointed me to this:
https://github.com/jp9000/libdshowcapture/blob/master/source/device.cpp#L304

Could the same be done with Windows Media Foundation?
It seems that some cameras (like mine) default to a lower capture framerate than they are capable of, so it would be useful to be able to set a custom capture framerate with Escapi, too :)
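(For reference, a minimal sketch of how this could look with Media Foundation, assuming a source-reader based capture path like Escapi's. The calls below are standard Media Foundation APIs; the target resolution/framerate is just this camera's case. A sketch, not Escapi's actual code.)

// Sketch: ask Media Foundation for a native mode matching a desired
// resolution and framerate, then make it the current type.
// Error handling is abbreviated; this is not Escapi's implementation.
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>

HRESULT requestMode(IMFSourceReader* aReader, UINT32 aW, UINT32 aH, UINT32 aFps)
{
	for (DWORD i = 0;; i++)
	{
		IMFMediaType* type = NULL;
		HRESULT hr = aReader->GetNativeMediaType(
			(DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, i, &type);
		if (FAILED(hr))
			return hr; // MF_E_NO_MORE_TYPES once the list is exhausted

		UINT32 w = 0, h = 0, num = 0, den = 0;
		MFGetAttributeSize(type, MF_MT_FRAME_SIZE, &w, &h);
		MFGetAttributeRatio(type, MF_MT_FRAME_RATE, &num, &den);

		if (w == aW && h == aH && den != 0 && num / den == aFps)
		{
			// Found a native mode with the wanted size and framerate.
			hr = aReader->SetCurrentMediaType(
				(DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, NULL, type);
			type->Release();
			return hr;
		}
		type->Release();
	}
}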

jarikomppa (Owner) commented Sep 6, 2017 via email

codec-abc commented Sep 18, 2017

I can help a bit, since I am working on the same subject. The first thing to do is find the supported "capture modes". What I call a capture mode is a combination of encoding (like RGBA32, MJPG, NV12, etc.), width, height, and framerate. For that, I recommend using a tool like GraphStudioNext. Then, you have to note a few things:

  1. Escapi only supports a few encodings. For example, MJPG is not supported, and on my laptop it is the only encoding running at 720p at 30 fps.
  2. Escapi does not care about framerate. If it finds two capture modes that differ only in encoding and framerate, it will pick the last one it encounters (if I recall correctly). I have tweaked the library a bit so it manages this well enough for my purposes. I can guide you through some code if you need it.
  3. There is post-processing to convert the captured buffer (regardless of its type) to the RGBA32 format. This is done on the CPU, without SIMD instructions, and I think it can put a heavy workload on the CPU at high resolutions and framerates (see the conversion sketch after this list).
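(To illustrate point 3: a sketch of the kind of per-pixel CPU conversion involved, here YUY2 to RGBA32 using the usual BT.601 integer math. This is an illustration, not Escapi's actual conversion code, and it ignores stride for simplicity.)

// Sketch: YUY2 (Y0 U Y1 V per pixel pair) to RGBA32, BT.601 integer math.
#include <cstdint>
#include <algorithm>

static uint8_t clamp8(int v) { return (uint8_t)std::min(255, std::max(0, v)); }

void Yuy2ToRgba32(const uint8_t* src, uint8_t* dst, int width, int height)
{
	for (int i = 0; i < width * height / 2; i++) // two pixels per macropixel
	{
		int u = src[1] - 128, v = src[3] - 128;
		int ys[2] = { src[0] - 16, src[2] - 16 };
		for (int p = 0; p < 2; p++)
		{
			int c = 298 * ys[p];
			*dst++ = clamp8((c + 409 * v + 128) >> 8);           // R
			*dst++ = clamp8((c - 100 * u - 208 * v + 128) >> 8); // G
			*dst++ = clamp8((c + 516 * u + 128) >> 8);           // B
			*dst++ = 255;                                        // A
		}
		src += 4;
	}
}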

Boscop commented Sep 19, 2017

I'd really appreciate your help!

  1. I downloaded and ran GraphStudioNext; how can I figure out the possible capture modes?
    I clicked on "add filter" and it showed 2 entries for my camera, which seem to have the same data.
    Both look like this: [screenshot not included]

    Why do both have MERIT_DO_NOT_USE?

  2. I'm interested to know how you tweaked the Escapi lib to let you choose a different capture mode. It's this code, right?
    codec-abc@879ed81#diff-c2dd74b222de2e25dcac565744140e6f

  3. Yes, I'd appreciate it if the given resolution could be treated as a maximum, so that it chooses the highest resolution with width * height <= max_width * max_height (still fitting in the buffer / having an upper bound for the buffer size) and then returns the width and height it found, so I could do the scaling on the GPU automatically (creating a texture with the returned width and height, and uploading that when drawing the quad).
    Do you have any plans to optionally eliminate the CPU-heavy rescaling code in your Escapi fork?

Btw, I've been using my own Rust bindings, which I started before Escapi had official Rust bindings.
They are closer to the C API: https://github.com/Boscop/escapi-rs
The reason I'm still using them is that they make it possible to get a camera's name without starting to capture with it, which I need for choosing the right camera by name before I start capturing...

codec-abc commented Sep 19, 2017

  1. You are almost there: add the filter to the graph, then right-click the "Capture" pin and click "Properties". You will probably get a popup where you can choose the encoding, the resolution, and the framerate. You need to try every combination to know what your camera can offer on your computer (the same camera might have different options on two different computers/OSes). Each combination is represented by a media subtype (which is why there are so many). You can do this in code too, by enumerating all the media subtypes and reading their properties (but you will need to find your way through the Windows API, which is not really easy). About MERIT_DO_NOT_USE, I have no idea what it is.
  2. Yes, that is part of it, but I think I will change it to directly give an ID that maps to a media subtype.
  3. I was partly wrong in my previous remark. There is a raw capture mode in Escapi (not sure if it is used, though) that would let you bypass the conversion. Still, you would need to implement the conversion in a shader if you want to display the image properly, and depending on the chosen encoding that might be difficult (maybe even impossible). For instance, converting NV21 to RGB is easy in a shader, but doing the same with MJPEG might not even be possible. By the way, you might want to change the code that chooses the capture mode, like I did here, so it suits your needs. For my part, I chose to convert the buffer back to a YUV format (because I need a luminance buffer for some image-processing work, and it is still easy to display). I need to benchmark it, but if it runs well enough I might keep the default conversion (I only target 720p at 30 fps). Note that the code does no rescaling, only conversion. The author chose to output the image in RGBA32 format no matter what the Windows API throws at him (except in raw capture mode). From my point of view that is a sensible default (except that RGB24 might have been better than RGBA32), but the overhead of the conversion may be too high for certain use cases.

Boscop commented Sep 27, 2017

I was quite busy lately. Now I tried that, and when I open the properties, I get this: [screenshot not included]

It doesn't let me choose a framerate other than 5 for YUY2.
But it also has MJPG output. When I choose that, it sets the frame rate field to 30: [screenshot not included]

Btw, there it lets me change the framerate with the up/down buttons; I can also get 25, 20, and 5 fps.

So does that mean YUY2 is the first / default capture mode, and Escapi chooses the default / first?
And now, how can I get MJPG @ 30 fps with your Escapi fork?

Thanks so much!

Btw, any idea why my camera appears twice in Filters?

Btw, IMO Escapi should convert from the camera format to a consistent output format that can easily be scaled in a shader, but it shouldn't scale it on the CPU.
So I think that RGB24 (at the highest resolution that fits into the given buffer) is the most reasonable default. It would also return the width and height it found as the highest combination that fits into the buffer (width * height * 3 <= given_buffer_size); a sketch of that selection rule follows.
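(A sketch of that rule; the Mode type and the list of advertised modes are placeholders for whatever the enumeration would return.)

// Sketch: among the camera's advertised resolutions, pick the largest one
// whose RGB24 frame fits in the caller's buffer.
#include <cstddef>
#include <vector>

struct Mode { int width; int height; }; // placeholder type

Mode pickLargestFitting(const std::vector<Mode>& modes, size_t bufferSize)
{
	Mode best = { 0, 0 };
	for (const Mode& m : modes)
	{
		size_t bytes = (size_t)m.width * m.height * 3; // RGB24: 3 bytes/pixel
		if (bytes <= bufferSize &&
		    (size_t)m.width * m.height > (size_t)best.width * best.height)
			best = m; // largest area that still fits
	}
	return best; // {0,0} if nothing fits; caller gets width/height back
}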

If you want, we can work on a fork together that does this. It could also have the raw mode that you want, but I think for most use cases the unscaled RGB24 mode is the most appropriate.

codec-abc commented Sep 27, 2017

It seems that your computer and/or camera is not able to stream YUY2 @ 1920x1080 at 30 fps. This is not surprising, to be honest. Note that if you lowered the resolution, you might be able to choose a higher framerate in YUY2. I think this is a bandwidth problem: YUY2 is less space-efficient than MJPG, so it consumes more bandwidth, and you have probably reached the limit at 5 fps (for 1920x1080).

If your goal is to display and/or process the camera stream at 1920x1080, you will need to "tweak" Escapi. My fork is kind of ugly (I have tweaked some more locally and might push those changes later) and not there yet, but I can give you some pointers.

The first thing to do is to implement the conversion for the MJPG encoding (you might want to double-check that there is no other format able to stream your camera at 1920x1080 @ 30 fps, because MJPG is not easy to support). To do that, you should add the MJPG format to the list of supported formats in Escapi (like here). But then you have to provide a conversion function. The conversion function should take the MJPG buffer of each frame and convert it to RGBA. Unless you plan to write a JPEG decoder yourself (which is quite a hard task), you might want to use a library. This one should work, and if you want to understand the reason for the fork, look at this issue. Maybe libjpeg would work too, but you will have to test that yourself. :)

At this point, you just need to make sure that Escapi selects the correct capture mode, as I explained before (point 3 of my previous message). In your case, you should select the capture mode with MJPG encoding, a resolution of 1920x1080, and 30 frames per second. For testing purposes you might want to hardcode the capture-mode selection (a sketch follows). This is currently the part I want to improve in Escapi: I want to return each possible capture mode and be able to select it by some kind of ID. It is a bit complicated, since I want to do it from another language (C# in my case).
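(For the hardcoded variant, the check could look like this. MF_MT_SUBTYPE and MFVideoFormat_MJPG are standard Media Foundation symbols; wiring this predicate into Escapi's type-selection loop is the part you would have to adapt. A sketch only:)

// Sketch: hardcoded predicate for the capture mode discussed here
// (MJPG, 1920x1080, 30 fps).
#include <mfapi.h>

bool isWantedMode(IMFMediaType* type)
{
	GUID subtype = GUID_NULL;
	UINT32 w = 0, h = 0, num = 0, den = 0;
	type->GetGUID(MF_MT_SUBTYPE, &subtype);
	MFGetAttributeSize(type, MF_MT_FRAME_SIZE, &w, &h);
	MFGetAttributeRatio(type, MF_MT_FRAME_RATE, &num, &den);
	// Integer division; a real implementation should compare the ratio properly.
	return subtype == MFVideoFormat_MJPG
	    && w == 1920 && h == 1080
	    && den != 0 && num / den == 30;
}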

My camera shows up twice in GraphStudioNext as well, and I have no idea why, but it does not seem important.

EDIT: If you manage to do that part cleanly, you might want to submit a pull request, because I would welcome the addition.

Boscop commented Sep 27, 2017

I tried your fork at this commit, but when I give it a desired framerate of 30 or 20, I only get a completely blue cam buffer from Escapi. Only when I give 5 as the desired framerate does it work like before.
Is it trying to use YUY2 with the given framerate (30)?

codec-abc commented Sep 27, 2017

I don't think you should use my fork (at least in this state), especially since it does not support MJPG. I don't know for sure what happens, but I think Escapi does not find a compatible capture mode at 1920x1080 @ 30 fps.
EDIT:
Well, I suppose you can use my fork to test things if you are not afraid :). If you add fake MJPG support (by doing a dumb buffer copy) like here, you should get some more "meaningful" output. It should be RGBA noise. Then, if you dump this noise to a file and save it with a .jpg extension, GIMP should be able to display the image (a sketch of such a dump follows).
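(The dump itself can be as dumb as this; buf and len stand for whatever your capture path hands you:)

// Sketch: write one captured buffer to disk with a .jpg extension so an
// image viewer (GIMP, a browser) can try to open it.
#include <cstdio>
#include <cstddef>

void dumpFrame(const unsigned char* buf, size_t len)
{
	FILE* f = fopen("frame_dump.jpg", "wb");
	if (f)
	{
		fwrite(buf, 1, len, f);
		fclose(f);
	}
}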

codec-abc commented

About your remarks on the output mode, I am not sure I agree. Ideally, I would like things to work this way (a header sketch follows the list):

  1. I ask Escapi to send me a list of connected cameras with their available capture modes. For each capture mode, Escapi tells me whether it would be able to convert it to a default format (say, RGB24).
  2. I choose a camera and a capture mode (or maybe even let Escapi pick one automatically) and tell Escapi whether I want the default conversion to happen.
  3. When a frame is complete, Escapi sends me either the raw buffer or a converted buffer, depending on what I asked for in the previous step. Then, what I do with that buffer is up to me. For my use case I will probably choose the raw buffer, because I don't want the additional CPU cost.

This way, the API is easy to use and the entry barrier is quite low (for the cases where the defaults suit you), but when you want control, you still have it without tweaking the library internals.
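(To make the idea concrete, a sketch of what such an interface could look like, in the flat C style of Escapi's existing header. Every name below is hypothetical; nothing like this exists in Escapi today.)

/* Hypothetical API sketch -- none of these functions exist in Escapi today. */

struct CaptureModeInfo
{
	int width;
	int height;
	int fpsNumerator;
	int fpsDenominator;
	unsigned int fourcc;        /* e.g. 'YUY2', 'MJPG' */
	int convertibleToDefault;   /* nonzero if Escapi could convert to RGB24 */
};

/* Step 1: list the capture modes a device offers. */
int countCaptureModes(unsigned int deviceno);
int getCaptureModeInfo(unsigned int deviceno, int mode,
                       struct CaptureModeInfo* info);

/* Step 2: start capturing in a given mode; rawOutput picks raw vs RGB24. */
struct SimpleCapParams; /* Escapi's existing params struct */
int initCaptureWithMode(unsigned int deviceno, int mode, int rawOutput,
                        struct SimpleCapParams* params);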

About the fork: if you want something that only works in your case, it might not fit this library's goals. But I don't think there is a need for that (except maybe temporarily, for testing). @jarikomppa seems to welcome pull requests, and I think if you come up with something that works well and is simple to use, he will accept the PRs. Anyway, if you need help I will try my best (in the little time I have).

Boscop commented Sep 27, 2017

Ah yes, getting a list of supported capture modes from Escapi would be fancy.

So it would work if I just get the MJPG buffer out and then use the jpeg-decoder crate to decode every frame on the client side?

Is this what you're already planning to do with your fork?

Btw, in your TransformImage_MJPG you can do the same as in TransformImage_RGB32, i.e. MFCopyImage(aDest, aDestStride, aSrc, aSrcStride, aWidthInPixels * 4, aHeightInPixels); which will probably be faster (it uses memcpy internally). And you increased gConversionFormats from 4 to 6 but only added 1 more format (MJPG) to the gFormatConversions array, so gConversionFormats should probably be 5.

If I return the raw MJPG buffer, can it be larger than the decoded RGB32 frame would be (the size of the user-provided buffer)?
If not, what was the reason you added the bufferLength arg to all the conversion functions?

codec-abc commented Sep 27, 2017

Yes, I think it should work if you add MJPG as a supported format and then send the buffer through the jpeg-decoder crate. This is indeed what I did with my fork (among other things, like getting the available capture modes). You just need to be cautious, as the buffer length will probably vary with each frame because of the compression. I think it will always be smaller than the buffer allocated by Escapi (which should be 1920x1080x4 bytes), so you are pretty safe regarding access violations. Yet I am not sure the jpeg-decoder will work properly if you pass it trailing (garbage) bytes, so give it the exact number of bytes taken up by the JPG frame.

The code of my "fork" is indeed neither clean nor optimal; that is why I tried to explain the overall picture before sending you pieces of code :). memcpy makes sense for copying the MJPG buffer (and I think I use it in my more up-to-date local version), but MFCopyImage is at best confusing: it gives the illusion of dealing with pixels while you are actually dealing with compressed data.

Boscop commented Sep 27, 2017

Ah, I see. Yeah, it would be cool to get MJPG support working. I'm not sure whether the MJPG frame has header info that tells the decoder how long the compressed frame data is. If it has a header, we can just always copy the whole buffer.

codec-abc commented Sep 27, 2017

You might want to look at that line. It seems I found a way to get the actual MJPG buffer length. Sadly, I don't remember how anymore, and since my code is poorly documented, it is not very helpful.
EDIT: Indeed, calling Lock on the IMFMediaBuffer retrieves the length of the valid data in the buffer. (source)
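(Concretely: IMFMediaBuffer::Lock's third out-parameter is the current length, i.e. the number of valid bytes. A sketch of copying just the JPG payload out of a sample's buffer:)

// Sketch: copy only the valid MJPG payload out of a media buffer; returns
// the byte count. IMFMediaBuffer::Lock is a standard Media Foundation API.
#include <windows.h>
#include <mfobjects.h>
#include <cstring>

DWORD copyPayload(IMFMediaBuffer* mediaBuffer, BYTE* dest, DWORD destSize)
{
	BYTE* data = NULL;
	DWORD maxLength = 0, currentLength = 0;
	if (FAILED(mediaBuffer->Lock(&data, &maxLength, &currentLength)))
		return 0;
	DWORD n = currentLength < destSize ? currentLength : destSize;
	memcpy(dest, data, n); // only the bytes that are actually valid JPG data
	mediaBuffer->Unlock();
	return n;
}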

Boscop commented Sep 27, 2017

I just tried with this function (same as TransformImage_RGB32):

void TransformImage_MJPG(
	BYTE*       aDest,
	LONG        aDestStride,
	const BYTE* aSrc,
	LONG        aSrcStride,
	DWORD       aWidthInPixels,
	DWORD       aHeightInPixels
	)
{
	MFCopyImage(aDest, aDestStride, aSrc, aSrcStride, aWidthInPixels * 4, aHeightInPixels);
}

And I get an access violation when it calls mConvertFn in CaptureClass::OnReadSample().
Hm, why would it cause an access violation?

codec-abc commented Sep 27, 2017

No idea, but like I said, MFCopyImage does not make sense for MJPG buffers. You should try

memcpy(aDest, aSrc, aWidthInPixels * aHeightInPixels * 4);

instead. But you will have "garbage" bytes after the end of the actual JPG data. I recommend that you look at my previous message and get the correct length of the buffer (a sketch follows).
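(Putting the two together, assuming the conversion functions carry the extra length argument mentioned earlier; the exact signature in the fork may differ:)

// Sketch: treat the MJPG sample as opaque bytes and copy only the valid
// payload. 'aBufferLength' would be the per-frame valid length, as discussed.
#include <windows.h>
#include <cstring>

void TransformImage_MJPG(
	BYTE*       aDest,
	LONG        /*aDestStride*/,  // stride is meaningless for compressed data
	const BYTE* aSrc,
	LONG        /*aSrcStride*/,
	DWORD       /*aWidthInPixels*/,
	DWORD       /*aHeightInPixels*/,
	DWORD       aBufferLength)
{
	memcpy(aDest, aSrc, aBufferLength);
}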

Boscop commented Sep 27, 2017

Thanks, I'm getting the MJPG data now; I can see it as noise in my OpenGL window.
So I'm trying to use jpeg_decoder now, but I get:

panic occured: "frame decoding failed: Format("first two bytes is not a SOI marker")"

Caused here:
https://github.com/kaksmet/jpeg-decoder/blob/master/src/decoder.rs#L131

How different is MJPG from JPG?

codec-abc commented Sep 27, 2017

You are almost there; noise is indeed the expected display of the MJPG buffer if you treat it as an RGBA texture. MJPG is not very different from JPG. The only difference I noted is the lack of Huffman tables in MJPG (but this is not even standardized). MJPG is so similar to JPG that if you dump the bytes of a frame into a file with a .jpg extension, many programs should be able to display it properly (including GIMP, Chrome, and Firefox). I suggest you dump the first few bytes of the buffer and post them here. There should be a magic number at the beginning identifying the buffer as JPG-encoded, and the first few bytes generally follow a pretty recognizable pattern. As to what is really happening, my guess is that there is an offset somewhere which screws up the parser. (A sketch of such a check follows.)
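(A quick sanity check you could run on a captured buffer: confirm the SOI marker and see whether any DHT segment, marker 0xFF 0xC4, is present. The scan is naive; a sketch, not a parser:)

// Sketch: verify a buffer starts with a JPEG SOI marker (0xFF 0xD8) and
// report whether it appears to carry Huffman tables (DHT, 0xFF 0xC4).
#include <cstddef>
#include <cstdio>

void inspectJpegBuffer(const unsigned char* buf, size_t len)
{
	if (len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)
	{
		printf("not a JPEG: no SOI marker\n");
		return;
	}
	bool hasHuffman = false;
	for (size_t i = 2; i + 1 < len; i++) // naive scan, may false-positive
		if (buf[i] == 0xFF && buf[i + 1] == 0xC4)
			hasHuffman = true;
	printf("SOI ok, DHT %s\n", hasHuffman ? "present" : "absent (MJPG-style)");
}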

Boscop commented Sep 27, 2017

I figured out that some frames are all zeroes for some reason. Maybe only the first frame(s)?
Anyway, I added a check for that so I skip those frames and only process frames that start with 0xFF and 0xD8 (SOI), and I could see a short flash of my webcam image in my OpenGL window!!!
But then it crashed about a second later on a frame with:

"frame decoding failed: Format("no marker found where RST7 was expected")"

When I rerun it, it panics with different expected RST numbers, always after a few seconds; sometimes also these:

"frame decoding failed: Format("found RST7 where RST0 was expected")"

Or:

"frame decoding failed: Format("found RST3 where RST4 was expected")"

And as you can see, there is an artifact in the frame; what could cause this? (It also looks horizontally flipped.) [screenshot not included]

So it decodes a few frames successfully before it crashes. Sometimes it runs for a few seconds, sometimes less than 1 s. Any idea why?

Btw, this is my decoding code each frame:

		if self.cam.is_capture_done() {
			self.cam.do_capture();
		}
		use std::slice;
		use jpeg_decoder::*;
		let data = unsafe { slice::from_raw_parts(self.cam_frame.as_ptr() as *const u8, self.cam_frame.len() * 4) };
		if !(data[0] == 0xFF && data[1] == 0xD8 /* SOI */) { return Ok(()); }
		let mut decoder = Decoder::new(data);
		let decoded = decoder.decode().expect("frame decoding failed");
		self.cam_tex.main_level().write(Rect {
				left: 0,
				bottom: 0,
				width: CAM_WIDTH as u32,
				height: CAM_HEIGHT as u32
			}, texture::RawImage2d {
				data: Cow::Owned(decoded),
				width: CAM_WIDTH as u32,
				height: CAM_HEIGHT as u32,
				format: texture::ClientFormat::U8U8U8
			}
		);

Update: I now got a perfectly decoded frame for a second before it crashed with:

"frame decoding failed: Format("found RST5 where RST6 was expected")"

codec-abc commented

You have made great progress. I did not run into those issues, and I am mostly clueless about the artifact, the fact that the whole image is flipped, or the crash. My advice would be to make sure there is no threading issue (i.e., the decoder is not trying to decode a frame while the next one is being written into the same buffer at the same time). Maybe you could upload your code so I can take a look in case I see something obvious.

Boscop commented Sep 27, 2017

Isn't Escapi running on the same thread as my decoder?

codec-abc commented Sep 27, 2017

I think it is, but that is the only thing that comes to mind.

EDIT: The jpeg crate has been updated to support decoding of MJPG frames, so I recommend using it instead of my fork.

EDIT 2: Can you post the code for the "JPG buffer copy" you used inside the C/C++ Escapi DLL?

Boscop commented Sep 27, 2017

You were right! I shouldn't call do_capture() before reading the current frame.
I got it working perfectly with this now:

		if self.cam.is_capture_done() {
			use std::slice;
			use jpeg_decoder::*;
			let data = unsafe { slice::from_raw_parts(self.cam_frame.as_ptr() as *const u8, self.cam_frame.len() * 4) };
			if !(data[0] == 0xFF && data[1] == 0xD8 /* SOI */) { return Ok(()); }
			let mut decoder = Decoder::new(data);
			let decoded = decoder.decode().expect("frame decoding failed");
			self.cam_tex.main_level().write(Rect {
					left: 0,
					bottom: 0,
					width: CAM_WIDTH as u32,
					height: CAM_HEIGHT as u32
				}, texture::RawImage2d {
					data: Cow::Owned(decoded),
					width: CAM_WIDTH as u32,
					height: CAM_HEIGHT as u32,
					format: texture::ClientFormat::U8U8U8
				}
			);
			self.cam.do_capture();
		}

Thanks a lot! Now I get 30 fps, but in a really hacky way...
I have no control over Escapi's output format, so my code assumes that when I pass 30 as the framerate, it gets MJPG data, but that's only valid for my current camera...
We really need a better interface for this kind of stuff...

Btw, there is also this lib: https://github.com/jp9000/libdshowcapture
It's written by the guy who wrote OBS, and it already supports all those pin formats.
It uses DirectShow instead of Media Foundation, but DirectShow is not going away, and from what I hear, people say that MF is designed worse than DirectShow...

codec-abc commented

I agree that the current code of Escapi does not make what we are trying to achieve easy, but after looking at libdshowcapture, I am not sure it would be simpler. The library seems to be a low-level wrapper around the DirectShow API, and I have no idea how to get a camera stream with it. Honestly, I don't think there is any single blocking issue preventing Escapi from evolving to a point where it can easily be used for our use cases. There is quite a lot to do, for sure, and it would probably take a bit of time, but I don't think it would be very hard in itself.

Boscop commented Oct 2, 2017

Btw, libdshowcapture doesn't seem very complicated:

<me> does libdshowcapture convert MJPG into rgb?
<Jim> yes
<me> thx. btw, how to get the camera video with libdshowcapture?
<Jim> call setvideoconfig/setaudioconfig, connect filters and check return value, then call start
<Jim> Device::SetVideoConfig, Device::SetAudioConfig, Device::ConnectFilters, then Device::Start and Device::Stop
<Jim> you can call Device::EnumVideoDevices (static) to enumerate the available devices
<Jim> then you use that information to fill out your VideoConfig value to pass to SetVideoConfig
<me> and it converts the camera's video format to rgb24 no matter which format it is?
<Jim> you have to specifically set the video format to XRGB, and then it will automatically find an intermediary filter to convert it to RGB
<me> ah ok, thx
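
(Reconstructing from Jim's answers only, usage would look roughly like this. The method names come from the log above; the header name, types, and config fields are assumptions, so check the library's headers:)

// Sketch of libdshowcapture usage, reconstructed from the IRC log above.
// Only EnumVideoDevices, SetVideoConfig, ConnectFilters, Start, and Stop
// come from the conversation; everything else is an assumption.
#include "dshowcapture.hpp"
#include <vector>

void captureSketch()
{
	std::vector<DShow::VideoDevice> devices;
	DShow::Device::EnumVideoDevices(devices); // static, per the IRC log

	DShow::Device device;
	DShow::VideoConfig config;
	// ... fill 'config' from devices[0]: device path, resolution, and the
	// XRGB output format so the library inserts a converter (per Jim) ...

	if (device.SetVideoConfig(&config) && device.ConnectFilters())
		device.Start(); // frames arrive via the callback set in 'config'

	// ... later ...
	device.Stop();
}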

codec-abc commented

Thanks for the info. If you manage to put together a sample using this library, please let me know.
