Version : 4.0
Author : Émile Robitaille @ LERobot
ulavalSFM is a free software manager to prepare and run structure from motion in parallel. It is in development. The structure from motion is based on bundlerSFM : https://github.com/snavely/bundler_sfm. My version of bundlerSFM will, however, recognize the "ulavalSFM.txt" file and the new "matches.init.txt" file. Your images have to be JPEG, with the .jpg extension only. Use ext.sh to change the extensions and cjpg.py to find hidden PNG files to delete (both installed with ulavalSFM). If your images come from the internet, I strongly suggest you use the cleanSFM.py script to clean the directory: it changes the extensions and names, and finds hidden PNG files.
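A typical invocation might look like the line below; the exact arguments are an assumption on my part, so check the script's own help for the real interface :

cleanSFM.py <your_image_dir>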
Because it's reliable and pretty simple to install on a cluster, I use the Anaconda Python 3.4 package http://continuum.io/downloads#34 (make sure it is Python 3.4). Make sure your Python bin path is $HOME/anaconda3/bin/ and it should work on your cluster.
cd <your_image_dir>
bundler.py --no-parallel --verbose --number-cores <number_cores_u_want>
You can change some bundlerSFM options if you want. Refer to the BundlerSFM repo : https://github.com/snavely/bundler_sfm.
Tested on CentOS 6 with a Lustre file system. Changing the code to make it work on your cluster should not be too difficult. Basically, you'll have to change the dispatcher call and the submit-file construction in bundler.py.
cd <your_image_dir>
bundler.py --no-parallel --verbose --number-cores <number_cores_u_want> --cluster --walltime <walltime_u_want>
You can change some bundlerSFM options if you want. Refer to the BundlerSFM repo : https://github.com/snavely/bundler_sfm.
- -h : Print this menu
- -v : Print the software version
- -l [dir] : Print information about the directory
- -n [1-*] : Specify the number of cores wanted (default 1, which means no MPI)
- -s [dir] : Find SIFT features of the images in the directory
- -m [dir] : Match SIFT features of the images in the directory
Don't use these options for now; a simpler Python script will follow to run all the algorithms :
- -c [0-1] : On cluster or not. If 1, a .sh submit script will be generated (default 0)
- -b [dir] : Run bundler on the given directory
- -a [dir] : Do the equivalent of "-s dir", "-m dir" and then "-b dir"
-l [dir] : Gives the directory name and the numbers of images, .key files and .mat files.
-c [0-1] : The software will use Torque's msub to submit the .sh file. Not implemented yet. You will eventually be able to change the dispatcher in a configuration file.
-n [1-*] : Uses OpenMPI to launch the external program cDoSift on multiple cores.
-s [dir] : Does SIFT detection using the OpenCV 2.4.9 implementation and writes the features in Lowe's binary format (a sketch follows after this list).
-m [dir] : Does the matching using OpenCV 2.4.9. It uses a knn search to find the two best matches for each descriptor and a ratio test of 0.6 to eliminate most of the bad matches. It then prunes the duplicate matches, prunes the outliers with a fundamental matrix estimated with RANSAC, and computes the geometric constraints bundlerSFM needs to begin the structure from motion, with a homography matrix also estimated with RANSAC. I use OpenCV to compute both matrices and their inliers (a second sketch follows after this list).
-b [dir] : Runs bundlerSFM with the options found in options.txt, automatically generated by the manager.
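To make the -s step concrete, here is a minimal sketch of SIFT detection with the OpenCV 2.4.9 C++ API, the implementation named above. It is an illustration only, not the actual cDoSift code; the grayscale loading and the bare main are my assumptions.

```cpp
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/nonfree/features2d.hpp> // SIFT lives in nonfree in 2.4.x

int main(int argc, char** argv)
{
    // SIFT works on intensity, so load the image in grayscale.
    cv::Mat img = cv::imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);

    cv::SIFT sift; // default detector/extractor parameters
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors; // one 128-float row per keypoint

    // Detect keypoints and compute descriptors in one call.
    sift(img, cv::noArray(), keypoints, descriptors);

    // keypoints[i].pt, .size and .angle are the four numbers written
    // at the head of each entry in the <name>.key file (see below).
    return 0;
}
```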
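And a sketch of the -m pipeline described above: knn search for the two best matches, the 0.6 ratio test, then RANSAC for the fundamental matrix and the homography. The brute-force matcher, the helper name and the thresholds other than 0.6 are my assumptions, not necessarily what ulavalSFM uses.

```cpp
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>

// Hypothetical helper: match one pair of images from their descriptors.
void matchPair(const cv::Mat& desc1, const cv::Mat& desc2,
               const std::vector<cv::KeyPoint>& kps1,
               const std::vector<cv::KeyPoint>& kps2)
{
    // 1) knn search: the two nearest neighbours of each descriptor.
    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector< std::vector<cv::DMatch> > knn;
    matcher.knnMatch(desc1, desc2, knn, 2);

    // 2) Ratio test at 0.6 to eliminate most of the bad matches.
    std::vector<cv::DMatch> good;
    for (size_t i = 0; i < knn.size(); i++)
        if (knn[i].size() == 2 &&
            knn[i][0].distance < 0.6f * knn[i][1].distance)
            good.push_back(knn[i][0]);

    // Collect the matched coordinates.
    std::vector<cv::Point2f> pts1, pts2;
    for (size_t i = 0; i < good.size(); i++) {
        pts1.push_back(kps1[good[i].queryIdx].pt);
        pts2.push_back(kps2[good[i].trainIdx].pt);
    }
    if (pts1.size() < 8) return; // too few matches for a fundamental matrix

    // 3) Prune outliers with a fundamental matrix estimated by RANSAC.
    std::vector<uchar> fInliers;
    cv::Mat F = cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC,
                                       3.0, 0.99, fInliers);

    // 4) Homography, also by RANSAC, for the geometric constraints
    //    bundlerSFM reads from matches.init.txt.
    std::vector<uchar> hInliers;
    cv::Mat H = cv::findHomography(pts1, pts2, cv::RANSAC, 3.0, hInliers);

    // fInliers / hInliers flag the matches that survive each test.
}
```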
The four numbers before the descriptor in Lowe's SIFT file format are :
| X coordinate | Y coordinate | scale | angle |
Note that OpenCV does not give a scale, so I used the keypoint size instead to keep the format compatible with programs like Changchang Wu's VisualSFM, which natively read Lowe's format. This does not affect my program, because we do not use scale in the structure from motion, but keep it in mind if you want to use my SIFT files for other purposes.
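For illustration, here is a sketch that writes that header in the order given above. It is shown in ASCII for readability (the files written by -s are binary, as noted earlier), and writeKeyHeader is a hypothetical helper, not code from ulavalSFM.

```cpp
#include <cstdio>
#include <vector>
#include <opencv2/features2d/features2d.hpp>

// Write the per-keypoint header lines: X, Y, scale (OpenCV size), angle.
// Descriptors are omitted; Lowe's format puts 128 values after each header.
void writeKeyHeader(FILE* f, const std::vector<cv::KeyPoint>& kps)
{
    fprintf(f, "%d 128\n", (int)kps.size()); // count, descriptor length
    for (size_t i = 0; i < kps.size(); i++)
        fprintf(f, "%f %f %f %f\n",
                kps[i].pt.x, kps[i].pt.y, kps[i].size, kps[i].angle);
}
```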
For now, I use a parallel design based on a relationship between a root, workers and a secretary. It is built using MPI.
There is a root which computes a distribution of all the images. Using that distribution, the root gives each worker a start point and an end point relative to a common loop (the root becomes a worker too). Each worker finds SIFT points and writes them to a <name>.key file. A sketch of the distribution is shown below.
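A minimal sketch of such a distribution; the helper name and the exact splitting rule are my assumptions, and any scheme that hands each rank a contiguous slice of the common loop works the same way.

```cpp
// Hypothetical helper: split nItems work items (images here, pairs for
// the matching step) into contiguous [start, end) slices, one per MPI rank.
void distribute(int nItems, int nProcs, int rank, int* start, int* end)
{
    int base = nItems / nProcs; // items every rank gets
    int rest = nItems % nProcs; // the first `rest` ranks get one extra
    *start = rank * base + (rank < rest ? rank : rest);
    *end   = *start + base + (rank < rest ? 1 : 0);
}
```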
There is a root which computes a distribution of all the pairs. Using that distribution, the root gives each worker a start point and an end point relative to a common double loop. The root then becomes a secretary. Since only one file is written, it is easier and, I think, more efficient to handle one writer than to handle the synchronisation needed by multiple writers. When a worker finishes a pair, it serializes the information and sends it to the secretary. The secretary quickly pushes the information back into a C++ vector; I do so because the secretary must always be free to receive information, otherwise some workers would be blocked. When all the pairs are computed, the secretary writes the information down in the file "matches.init.txt". The design has a limit on the number of images because, at some point, the secretary could run out of memory. In the worst case, if we consider that each pair takes 8 KB (~2000 matches) and that the root has 3 GB of RAM, it is possible to match a maximum of approximately 850 images (850 images give 850 * 849 / 2 ≈ 361 000 pairs, about 2.8 GB at 8 KB each). But since some pairs are discarded, and pairs usually have far fewer than 2000 matches, it is possible to handle many more images in practice. A sketch of the secretary loop is shown below.
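A sketch of the secretary loop under assumed message conventions: each worker sends one serialized buffer per finished pair and an empty message when it has no more pairs. The tags and framing are illustrative, not the actual protocol.

```cpp
#include <mpi.h>
#include <vector>

// Buffer every finished pair in memory; write the file only at the end.
void secretaryLoop(int nWorkers)
{
    std::vector< std::vector<char> > results; // one entry per finished pair
    int running = nWorkers;

    while (running > 0) {
        MPI_Status st;
        // Probe first so the receive buffer can be sized exactly.
        MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &st);

        int len = 0;
        MPI_Get_count(&st, MPI_CHAR, &len);

        std::vector<char> buf(len);
        MPI_Recv(len ? &buf[0] : NULL, len, MPI_CHAR,
                 st.MPI_SOURCE, st.MPI_TAG, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        if (len == 0)
            running--;              // that worker is done
        else
            results.push_back(buf); // push back quickly, stay free to receive
    }
    // ... all pairs received: write matches.init.txt from `results` ...
}
```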
Don't hesitate to send me an email : [email protected]