The robot exists in a planar environment. This is an approximation, as the robot has a finite height above the table on which it moves. Provided the distance of the camera from the robot is sufficiently large in comparison to this height, the error from this approximation is acceptable.
The camera views the scene from an arbitrary position. The frame grabbed from it is a second 2D environment. To infer the robot position from the frame buffer, it is necessary to know the transformation between the two planes. This transformation can be decomposed into three matrix operations on the homogeneous coordinates of the robot position.
Homogeneous coordinates are a method of representing points in n-space as (n+1)-dimensional vectors with arbitrary scale. They have two inherent advantages in this application:
1. To recover the n-space point from its homogeneous coordinate, it is necessary to divide the first n elements of the vector by the (n+1)th. This division allows certain non-linear transformations, such as a projective one, to be represented by a matrix multiplication.
2. An addition or subtraction in n-space (that is, a translation) can also be expressed as a matrix multiplication, so it can be condensed with other transformations into a single matrix.
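Both properties can be checked with a few lines of code. The following is a minimal sketch in plain Python, not code from the project itself; the helper names (`to_homogeneous`, `from_homogeneous`, `mat_vec`, `translation_2d`) are made up for illustration.

```python
def to_homogeneous(point, scale=1.0):
    """Represent an n-space point as an (n+1)-vector with the given scale."""
    return [c * scale for c in point] + [scale]


def from_homogeneous(vec):
    """Recover the n-space point: divide the first n elements by the last."""
    w = vec[-1]
    return [c / w for c in vec[:-1]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def translation_2d(tx, ty):
    """3x3 matrix that adds (tx, ty) to a 2D point in homogeneous form."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]


# The scale is arbitrary: [8, 12, 4] and [2, 3, 1] are the same 2D point.
same_point = from_homogeneous(to_homogeneous([2.0, 3.0], scale=4.0))

# A translation (an addition in 2-space) done as one matrix multiplication.
p = to_homogeneous([2.0, 3.0])
moved = from_homogeneous(mat_vec(translation_2d(5.0, -1.0), p))
print(same_point, moved)  # [2.0, 3.0] [7.0, 2.0]
```

Because the translation is now a matrix, it can be multiplied together with rotation and projection matrices ahead of time and applied to each point in one step.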
The first matrix multiplication is the rigid-body transformation from world-centred coordinates on the table to camera-centred coordinates. It maps the 3-element homogeneous coordinate representing the 2D point on the ground plane to a 4-element homogeneous coordinate giving its 3D position in camera-centred coordinates.
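One way such a matrix could be built is sketched below. It assumes that the table is the plane z = 0 in world coordinates and that the camera pose is given as a 3x3 rotation `R` and a translation `t`; these names, the function `plane_to_camera_matrix`, and the example pose are assumptions for illustration rather than values from the actual setup. Since z = 0 for every point on the table, the third column of `R` never contributes, which leaves a 4x3 matrix.

```python
def plane_to_camera_matrix(R, t):
    """Build the 4x3 matrix mapping (x, y, 1) on the plane z = 0
    to (X, Y, Z, 1) in camera-centred coordinates.

    R is a 3x3 rotation (list of rows) and t a 3-element translation.
    Only the first two columns of R are needed because z is always 0.
    """
    rows = [[R[i][0], R[i][1], t[i]] for i in range(3)]
    rows.append([0.0, 0.0, 1.0])  # keeps the homogeneous scale unchanged
    return rows


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Hypothetical pose: the camera looks straight down at the table, i.e. a
# 180 degree rotation about the x axis, with the world origin at
# (0.5, 0.25, 2.0) in camera coordinates.
R = [[1.0, 0.0, 0.0],
     [0.0, -1.0, 0.0],
     [0.0, 0.0, -1.0]]
t = [0.5, 0.25, 2.0]

M = plane_to_camera_matrix(R, t)
robot_on_table = [1.0, 0.5, 1.0]           # homogeneous 2D point (x, y, 1)
robot_in_camera = mat_vec(M, robot_on_table)
print(robot_in_camera)  # [1.5, -0.25, 2.0, 1.0]
```

In this example pose every point on the table ends up at the same depth Z = 2.0, as expected for a camera whose optical axis is perpendicular to the table.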