Photogrammetry
← Previous topic: Understanding Image Geometries | Next topic: Intrinsic Calibration →
The connection between the locations of objects in the world and their corresponding locations in an image is described by photogrammetric relationships. The form of these relationships that we use is taken from Hartley and Zisserman [2003] and is based on the concept of homogeneous coordinates. It is well described in several references but is summarized here for completeness, and because many of the implementation details require this knowledge. Much of this content is taken from the UAV methods paper by Holman, Brodie and Spore [2017].
By convention, objects in the world are described by the 3D coordinates [x, y, z] (cross-shore, longshore, vertical), while their image locations are described by the 2D coordinates [U, V] (both are right-handed coordinate systems). In a homogeneous formulation, the two are related through a 3 by 4 projective transformation matrix, P, such that
λ [U, V, 1]^T = P [x, y, z, 1]^T        (1)
The normal 2D and 3D vectors are each augmented by an additional coordinate, set to 1. Thus, for any particular world location, if P is known, the image location is found by the multiplication in equation (1). In homogeneous coordinates, the vector on the left is known only to within a multiplicative constant (λ above). This means that the literal product of the multiplication in (1) will generally have a last component that is not equal to 1; dividing the whole vector by that last value yields an equivalent result, and the first two components of the normalized vector are the image coordinates of the object. Thus, computing image coordinates requires first the multiplication, then the normalization that makes the last element equal to 1. There are many benefits that make up for the inconvenience of this second step.
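The short MATLAB sketch below illustrates equation (1) and the normalization step. The P used here is only a random placeholder; in practice it comes from the calibrations described below.

```matlab
% Minimal sketch of equation (1): project a world point into image
% coordinates with a 3 by 4 projection matrix P, then normalize so the
% last homogeneous element equals 1.
P = rand(3,4);  P = P / P(3,4);   % placeholder projection matrix, P(3,4) = 1
xyz = [150; 600; 2];              % world point [x; y; z] in meters

UV1 = P * [xyz; 1];               % homogeneous image vector, known to a constant
UV1 = UV1 / UV1(3);               % normalize so the last element equals 1
U = UV1(1);  V = UV1(2);          % image coordinates of the object
```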
The information in the projection matrix, P, had previously been represented in a different form, a 1 by 11 vector usually identified by the symbol m. This form of the projective equations was used by the CIL until we discovered homogeneous coordinates. It is still used in routines that convert between image and world coordinates and was the basis of the CIL photogrammetry paper by Holland et al. [1997].
While P has 12 elements, the last one, P(3,4), is set equal to 1 by convention. This is consistent with the idea that the entire multiplication is valid to within a multiplicative constant, i.e. dividing P by the last element yields the same projection. Thus there are only 11 degrees of freedom, equivalent to the 11 elements of m.
Conversion between m and P is done by routines m2P and P2m.
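As an illustration of the 11 versus 12 element bookkeeping, the sketch below packs an 11-element m vector into P row-wise with P(3,4) set to 1. The ordering shown is an assumption for illustration only; the authoritative conversions are m2P and P2m.

```matlab
% Illustration only: one plausible row-wise packing of an 11-element m vector
% into the 3 by 4 matrix P, with P(3,4) fixed at 1. The actual element
% ordering is defined by the m2P and P2m routines, which should be used in
% practice.
m = rand(11,1);                          % placeholder geometry vector
P = [m(1:4)'; m(5:8)'; [m(9:11)' 1]];    % assumed packing, P(3,4) = 1
mBack = [P(1,1:4), P(2,1:4), P(3,1:3)]'; % inverse of the same packing
```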
Conversion from world to image coordinates is straightforward through equation (1), remembering that you must divide by the last element of the resulting vector to make it equal to 1. Converting from image to world coordinates is under-determined (two known image coordinates but three unknown world coordinates), so it requires additional information. Often we simply specify one coordinate; for example, when we make a rectification (horizontal map) from an oblique image, we map onto the z = 0 or z = zTide surface.
The routines findUV and findXYZ carry out these calculations. Note that both use the m format of the image geometry. Both routines are included in the Support-Routines repository. There are special versions of these routines in the UAV repository that are based on equation (1) and build up P from its component parts, as described below.
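For illustration, the sketch below inverts equation (1) onto a specified z surface by rearranging the two pixel equations into a 2 by 2 linear system. It is not the CIRN findXYZ routine, just the algebra behind the z = constant case.

```matlab
% Illustrative inversion of equation (1) onto a z = constant surface. Given
% P, a pixel (U, V), and a specified elevation z, the two pixel equations
% are rearranged into a 2 by 2 linear system for the unknown x and y.
P = rand(3,4);  P = P / P(3,4);  % placeholder projection matrix
U = 800;  V = 600;               % image coordinates of interest (pixels)
z = 0;                           % rectification surface, e.g. z = 0 or z = zTide

% From U = P(1,:)*X / (P(3,:)*X) and V = P(2,:)*X / (P(3,:)*X), X = [x;y;z;1]:
A = [P(1,1) - U*P(3,1),  P(1,2) - U*P(3,2); ...
     P(2,1) - V*P(3,1),  P(2,2) - V*P(3,2)];
b = [U*(P(3,3)*z + P(3,4)) - (P(1,3)*z + P(1,4)); ...
     V*(P(3,3)*z + P(3,4)) - (P(2,3)*z + P(2,4))];
xy = A \ b;                      % world [x; y] on the specified z surface
```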
The projective matrix is composed of three factor matrices,
P = K R [I | -C]        (2)
K contains the intrinsic parameters of the camera, those that convert from angle away from the center of view into camera coordinates. R is the rotation matrix describing the 3D viewing direction of the camera relative to the world coordinate system. The final bracketed term is a 3 by 3 identity matrix, I, augmented by -C, where C is a 3 by 1 vector of the camera location in world coordinates. Taking the multiplication (equation 1) in steps, first multiplying the bracketed term in (2) by the object world coordinates subtracts the camera location from the object location, effectively putting the object in camera-centric coordinates. Then multiplying by the rotation matrix rotates into directions relative to the camera look direction. Finally, multiplying by K, the intrinsic matrix, converts into pixel units for the particular lens and sensing chip. The relationship between the steps that convert from world to image coordinates and these matrix multiplications is well illustrated in Hartley and Zisserman [2003] on pages 153-157.
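The sketch below assembles P from its three factors as in equation (2). The K, R, and C values are placeholders, not a real calibration.

```matlab
% Sketch of equation (2): composing P from the intrinsic matrix K, the
% rotation matrix R, and the camera location C. All values are placeholders;
% in practice K comes from the intrinsic calibration (LCP) and R and C from
% the extrinsic calibration.
K = [1400 0 960; 0 1400 540; 0 0 1];  % example intrinsic matrix (pixel units)
R = eye(3);                           % example rotation (camera axes = world axes)
C = [0; 100; 25];                     % example camera location [x; y; z] (m)

P = K * R * [eye(3), -C];             % equation (2)
P = P / P(3,4);                       % optional scaling so that P(3,4) = 1
```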
The process of populating the K matrix is known as intrinsic calibration and can be completed in the lab (it is independent of the installation). Once calibrated, the resulting data should be stored in a database or a MATLAB function. For UAV work, these are currently implemented as Lens Calibration Profiles (LCPs) in files that look like makeLCPP3.m (for a DJI Phantom 3). The intrinsic calibration process is described in the Intrinsic Calibration topic.
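For reference, below is a sketch of the conventional form of K from Hartley and Zisserman [2003]: focal lengths in pixel units on the diagonal and the principal point in the last column. The LCP field names used here are hypothetical; the actual structure is defined in makeLCPP3.m.

```matlab
% Conventional form of the intrinsic matrix K (Hartley and Zisserman 2003).
% The LCP field names fx, fy, c0U, c0V are illustrative placeholders only;
% see makeLCPP3.m for the actual LCP contents.
lcp.fx  = 2300;    lcp.fy  = 2300;    % focal length in pixel units (example)
lcp.c0U = 1920/2;  lcp.c0V = 1080/2;  % principal point (example, image center)

K = [lcp.fx,  0,       lcp.c0U; ...
     0,       lcp.fy,  lcp.c0V; ...
     0,       0,       1      ];
```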
The remaining six parameters are functions of the installation so cannot be determined ahead of time. They comprise the rotation matrix and the camera position.
The rotation matrix, R, represents the 3D rotation between world and camera coordinates. There are 3 degrees of freedom, the azimuth (taken here as the compass-like rotation clockwise from the positive y-axis), the tilt (zero at nadir, rising to 90° at the horizon), and roll (rotation about the look direction, positive in the counter-clockwise direction as viewed from the camera). The details can be found on page 612 in Wolf [1983]. Routines angles2R.m and R2Angles.m can be used to build or decompose R.
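As a structural illustration only, a rotation with these three degrees of freedom can be assembled as a product of elementary rotations; the exact sign and ordering conventions of Wolf [1983] and angles2R.m may differ from the assumed composition below, so use angles2R.m for real geometries.

```matlab
% Structural illustration only: one way to compose a 3D rotation from
% azimuth, tilt, and roll as a product of elementary rotations. The sign and
% ordering conventions here are assumptions; the actual convention (following
% Wolf 1983) is implemented in angles2R.m and R2Angles.m.
azimuth = deg2rad(205);  % compass-like rotation clockwise from +y (example)
tilt    = deg2rad(70);   % 0 at nadir, 90 degrees at the horizon (example)
roll    = deg2rad(2);    % rotation about the look direction (example)

Rz = @(a) [cos(a) -sin(a) 0; sin(a) cos(a) 0; 0 0 1];  % rotation about z
Rx = @(a) [1 0 0; 0 cos(a) -sin(a); 0 sin(a) cos(a)];  % rotation about x

% Assumed z-x-z ordering: azimuth about the vertical, then tilt, then roll
R = Rz(roll) * Rx(tilt) * Rz(azimuth);
```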
The camera location, C, is a 3 by 1 vector so has 3 degrees of freedom, its 3D world location.
Further information on the extrinsic calibration procedures used to solve for these matrices is found in Extrinsic Calibration.
← Previous topic: Understanding Image Geometries | Next topic: Intrinsic Calibration →