Getting started with Vision #7
The vision system as a whole can be confusing. It is entirely possible to implement the algorithm in that paper without knowing anything about the vision system. I would suggest initially writing the code in a sandbox and later integrating it into the system; it will be easier to test and play around with if it stands alone at first. If you want to know about the architecture, I can try to describe it in some more detail later.

One of the main things to note is that the vision system has a set of detectors which can individually be turned on and off. When a detector is on, it is fed images from one of the cameras. It processes each image and publishes results as events, which can be picked up by the Python code, the estimation code, etc. The set of available detectors is hard-coded, but the detectors themselves are configurable.

Code to look at in order to understand the architecture.

Code to look at to understand how we currently find things from a technical standpoint.
Keep asking questions as you run into them. There are many places where architectural improvements could be made. I personally dislike the Image interface. I think you should use the new OpenCV API, i.e. cv::Mat. Somewhere, we should add a conversion function to go between cv::Mat and Image.
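As a rough illustration of the architecture described above, here is a hypothetical C++ sketch of what a detector could look like. Only the general shape (detectors that can be switched on and off, are fed camera frames, and publish events) comes from this discussion; the class names below are placeholders, not the actual Tortuga classes.

```cpp
// Hypothetical sketch of the detector architecture -- not the real Tortuga code.
// Detector, Event, and EventPublisher are placeholder names.
#include <opencv2/core/core.hpp>
#include <string>
#include <iostream>

struct Event                       // placeholder for the real event types
{
    std::string name;
    double x, y;                   // e.g. image coordinates of whatever was found
};

class EventPublisher               // placeholder: the real system routes events to Python/estimation
{
public:
    void publish(const Event& e)
    {
        std::cout << "event: " << e.name << " at (" << e.x << ", " << e.y << ")\n";
    }
};

class Detector                     // one of the hard-coded set of detectors
{
public:
    explicit Detector(EventPublisher& pub) : m_pub(pub), m_enabled(false) {}
    virtual ~Detector() {}

    void setEnabled(bool on) { m_enabled = on; }

    // Called with each frame from a camera; does nothing when switched off.
    void processImage(const cv::Mat& frame)
    {
        if (m_enabled)
            update(frame);
    }

protected:
    // Each concrete detector implements its processing here and publishes
    // an event when it finds something.
    virtual void update(const cv::Mat& frame) = 0;

    EventPublisher& m_pub;

private:
    bool m_enabled;
};

class ExampleDetector : public Detector
{
public:
    explicit ExampleDetector(EventPublisher& pub) : Detector(pub) {}

protected:
    virtual void update(const cv::Mat& frame)
    {
        // Stand-in processing: just report the frame centre.
        Event e;
        e.name = "example";
        e.x = frame.cols / 2.0;
        e.y = frame.rows / 2.0;
        m_pub.publish(e);
    }
};

int main()
{
    EventPublisher pub;
    ExampleDetector det(pub);
    det.setEnabled(true);
    det.processImage(cv::Mat::zeros(480, 640, CV_8UC3));  // dummy frame
    return 0;
}
```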
If you want to look at how a detector works and uses all of those together, it might be good to trace through the BuoyDetector code (start at the update() function), since that is a good standard example of how we currently use the color filter and blobs. In general, it grabs an image, processes it for red blobs, publishes an event if it found anything, and then goes through the same process for green and then yellow.

That process consists of first grabbing the config values for the "positive" range of pixel values (YUV in this case, but other color spaces like RGB can be used). These config values are the ones we find using the VisionTool, and are essentially user-defined pixel value ranges for what we will consider "acceptable" as the color of the buoy we are looking for. With those config values, we iterate through each pixel of the image, determine whether each color channel for that pixel falls within the range, and create a binary/boolean image (an image with 1 bit corresponding to each pixel). This is done using the ColorFilter class.

We then take this binary image and toss it into the BlobDetector class, which iterates through the image and groups together any positive/white pixels that are touching. The BlobDetector will then spit out several "blobs": data for each group of positive/white pixels, such as the x-coordinate of the left-most pixel, the y-coordinate of the highest pixel, how many positive pixels there are, etc. With these blobs, the detector looks through each one and sees if it fits the other config values (the blob must be of a certain size, and must have a certain percentage of positive pixels). If it fits all of them, then we accept it as a buoy and publish information for it. (Note: if I remember correctly, we search by largest first and only publish one of each kind of buoy, so we publish the largest one we find and ignore any other blobs that may be positive.)

The bin vision does some similar stuff (at least in that it uses a basic color filter), but it also does some extra processing; I'll explain that in detail in a later post. And yes, definitely work in the sandbox at first; you don't want the added challenge of system integration when trying something new. Also, the more C++-friendly OpenCV API (cv::Mat and the like) is much easier to use, but you may have trouble finding examples that use it. Avoid using IplImage, in my opinion. I updated the build system to use a new API version a while ago (I think currently we're stuck on 2.1?) and a lot has changed, so when I get around to double-checking my work and committing that, be prepared for some hiccups.
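To make that concrete, here is a minimal, stand-alone sketch of the same color-filter/blob idea written against the plain OpenCV C++ API, not the project's ColorFilter and BlobDetector classes. The YCrCb conversion stands in for the YUV step, and the pixel ranges and size/fill thresholds are made-up placeholders for the VisionTool config values.

```cpp
// Sketch of the color-filter -> blob pipeline described above, using plain OpenCV.
// Ranges and thresholds are placeholders; real values come from the VisionTool config.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <vector>
#include <iostream>

int main(int argc, char** argv)
{
    if (argc < 2) return 1;
    cv::Mat frame = cv::imread(argv[1]);           // image from disk standing in for a camera frame
    if (frame.empty()) return 1;

    cv::Mat yuv;
    cv::cvtColor(frame, yuv, CV_BGR2YCrCb);        // YCrCb, a close relative of YUV

    // "Positive" pixel range (placeholder values; normally user-defined per color).
    cv::Scalar lower(0, 140, 0), upper(255, 255, 120);
    cv::Mat mask;
    cv::inRange(yuv, lower, upper, mask);          // binary image: 255 where all channels are in range

    // Group touching positive pixels into blobs via contours.
    cv::Mat maskCopy = mask.clone();               // findContours modifies its input
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(maskCopy, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // Keep the largest blob that passes the size/fill checks, like the detector does.
    double bestArea = 0.0;
    cv::Rect bestBox;
    for (size_t i = 0; i < contours.size(); ++i)
    {
        double area = cv::contourArea(contours[i]);
        cv::Rect box = cv::boundingRect(contours[i]);
        double fill = area / (double)(box.width * box.height); // rough "percentage of pixels"
        if (area > 200.0 && fill > 0.5 && area > bestArea)      // placeholder thresholds
        {
            bestArea = area;
            bestBox = box;
        }
    }

    if (bestArea > 0.0)
        std::cout << "buoy candidate at (" << bestBox.x << ", " << bestBox.y
                  << ") area " << bestArea << std::endl;
    return 0;
}
```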
Alright, sounds good. I'll give BuoyDetector as well as those other files a look over soon, and get some practice with the sandbox.
I've made a wiki page for the important information that comes out of this discussion: https://github.com/robotics-at-maryland/tortuga/wiki/Vision-System
A good project to get familiar with OpenCV, and one that would be really helpful, is to update the code base to work with a newer version of OpenCV. Compile and work with either 2.3 or 2.4. Install it as a dependency to /opt/ram/local and upload it to the site. Modify the bootstrap script to not download the packages from the Ubuntu repository, and modify CMake to use these files instead. You can just hard-code this for now. I can likely help more with the CMake files at some point, but doing anything non-trivial in CMake is a pain that requires digging through documentation for three hours and then killing yourself, which I wasn't planning to do this week. Honestly, familiarity with OpenCV will likely get you pretty far. Although you do need to understand some of the concepts that exist in OpenCV, you don't need to code them yourself (unless that helps you understand them).
Yeah, what I've mostly been doing is looking through the tutorials on the OpenCV site to get familiar with all the concepts. I'll give updating to OpenCV 2.4 a go at some point. Are there any other good resources that you would suggest?
I would say the steps are as follows:
1. Download OpenCV 2.4 and do a vanilla build, installed to /opt/ram/local.
2. Upload the resulting install to the site so others can pull it as a dependency.
3. Modify the bootstrap script so it no longer downloads OpenCV from the Ubuntu repository.
4. Point CMake at the files in /opt/ram/local (hard-coding the paths is fine for now).
Compiling OpenCV 2.4 should be a breeze. You probably just want a vanilla version without worrying about adding optional libraries like Eigen or CUDA. If you have FFmpeg installed from the Ubuntu repository, it should pick it up.
That should just work. It will place a FindOpenCV.cmake script into /opt/ram/local/share. There aren't many good tutorial-like resources for the new API. The API documentation itself is pretty decent though. Please use the new API everywhere possible.
I'll transfer the tutorials I've put on the old wiki to this one. If you can't figure out how to do something with the new API that you think you should be able to do, just ask; it will help me expand the tutorials. As a note that will probably help you get the current code working with 2.4: CV_RGB can be found in "highgui.h".
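For a feel of the newer C++ API style being recommended here, a small stand-alone example follows. The file names are placeholders, and the calls are plain OpenCV 2.x functions, not anything from the Tortuga code base.

```cpp
// Small example of the newer C++ OpenCV API (cv::Mat), as recommended above
// over IplImage. File names are placeholders; error handling is minimal.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

int main()
{
    // Memory is managed by cv::Mat itself -- no cvReleaseImage() bookkeeping.
    cv::Mat color = cv::imread("frame.png");      // placeholder input file
    if (color.empty())
        return 1;

    cv::Mat gray, edges;
    cv::cvtColor(color, gray, CV_BGR2GRAY);
    cv::GaussianBlur(gray, gray, cv::Size(5, 5), 1.5);
    cv::Canny(gray, edges, 50, 150);

    cv::imwrite("edges.png", edges);              // placeholder output file

    // If a legacy IplImage* is unavoidable (e.g. from older code), it can be
    // wrapped without copying in OpenCV 2.x:  cv::Mat wrapped(iplPtr);
    return 0;
}
```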
So, up until now I've been mostly working with the AI code. I've been meaning to learn how to work with the vision code for a while, but I've been too busy working with the new software people and doing things for classes. Now that it's winter break, I can actually devote time to learning how the vision system works. Specifically, I was going to try and implement the underwater vision filter from the article one of you brought up some time ago. I've already looked into OpenCV a little bit. As for our code, Eliot told me that you were the people to go to for advice.
So, my question is, what should I do to get started with learning about the vision system? What kind of things should I look at to get a feel for how it works?