COSC 450 Assignment 2:
15% of your Final Grade
Due: 23:59 Friday 30 April 2021
In this assignment you will implement a basic stereo depth estimation system:
• Implement a basic stereo matching algorithm for rectified image pairs.
• Reconstruct the 3D coordinates of the scene.
• Experiment with the parameters of the system.
• Implement and experiment with a way to remove incorrect correspondences.
The process of taking a stereo image to a 3D model has several stages. The first two are
implemented for you:
• Stereo camera calibration. As well as the intrinsic properties (the camera calibration
matrix, K, and distortion coefficients, 𝐝) of each camera, the rotation, R, and
translation, 𝐭, between the two cameras are computed. The calibration of the stereo
camera you can use for this assignment has been precomputed, and is provided as the
file calibration.txt.
• Stereo image rectification. As we will discuss in lectures, the corresponding point to
a location in one image is constrained to a line (the epipolar line) in the second image.
It is computationally easier if these are horizontal lines, so a point (𝑥1, 𝑦1) in the first
image corresponds to some point on the row 𝑦2 = 𝑦1, in the second image. Warping
the images so that this is the case is known as stereo rectification, and the appropriate
warps can be computed from the stereo camera calibration data. In OpenCV [1] this is
done with the functions stereoRectify and initUndistortRectifyMap, then
the warp is applied with remap.
[1] https://opencv.org/
Original stereo pair (top) and rectified pair (bottom)
The next two steps you will need to implement yourself:
• Disparity estimation. The disparity is the apparent motion between the
corresponding points in the two images. For rectified images this is always positive for
points in front of the cameras, and is larger for nearer objects. While there are quite
complex stereo disparity estimation algorithms in the literature [2], a simple sum of
squared differences (SSD) or sum of absolute differences (SAD) over a sliding window
is sufficient.
Disparity map estimated from the images above. Brighter points have larger disparity, so are closer.
• Compute 3D coordinates. Disparities have an inverse relationship to distance from
the camera, and so can be converted into 3D coordinates. See the lecture notes for
details on how 3D coordinates can be computed from disparities.
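The disparity estimation step described above can be sketched as a brute-force search along each row. The following is a minimal illustration rather than the sample code's method: the GrayImage type and the function names are hypothetical, it uses only the C++ standard library, and it assumes the first image is the left one, so that a pixel at column x in the left image matches column x - d in the right image. If your images are the other way around the sign of the search must be flipped.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical minimal greyscale image (row-major), used only for this sketch.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;
    unsigned char at(int x, int y) const { return pixels[y * width + x]; }
};

// Sum of absolute differences between the window centred on (x, y) in `left`
// and the window centred on (x - d, y) in `right`.
long windowSAD(const GrayImage& left, const GrayImage& right,
               int x, int y, int d, int halfWin) {
    long sum = 0;
    for (int dy = -halfWin; dy <= halfWin; ++dy)
        for (int dx = -halfWin; dx <= halfWin; ++dx)
            sum += std::abs(int(left.at(x + dx, y + dy)) -
                            int(right.at(x + dx - d, y + dy)));
    return sum;
}

// Returns one disparity per pixel of the left image, or -1 for border pixels
// where the matching window does not fit inside the image.
std::vector<int> computeDisparitySAD(const GrayImage& left, const GrayImage& right,
                                     int maxDisparity, int halfWin) {
    assert(left.width == right.width && left.height == right.height);
    assert(maxDisparity >= 0 && halfWin >= 0);
    std::vector<int> disparity(left.width * left.height, -1);
    for (int y = halfWin; y < left.height - halfWin; ++y) {
        for (int x = halfWin; x < left.width - halfWin; ++x) {
            long bestCost = std::numeric_limits<long>::max();
            int bestD = -1;
            // Only try disparities that keep the right-hand window in the image.
            for (int d = 0; d <= maxDisparity && x - d - halfWin >= 0; ++d) {
                const long cost = windowSAD(left, right, x, y, d, halfWin);
                if (cost < bestCost) { bestCost = cost; bestD = d; }
            }
            disparity[y * left.width + x] = bestD;
        }
    }
    return disparity;
}
```

Replacing the absolute difference with a squared difference gives the SSD. The window size (halfWin) and the search range (maxDisparity) are the kind of parameters the report asks you to experiment with.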
[2] https://vision.middlebury.edu/stereo/eval3/
Once the 3D coordinates have been computed, the data can be written to a file. A common
format is a point cloud: essentially a list of 3D coordinates which may have other data (such
as colour) associated with them. A simple file format for storing this data is the PLY format [3],
which has an ASCII version that is easy to write directly, or there are various libraries available
for managing them. PLY files (and many other 3D formats) can be viewed in MeshLab [4]. The
sample code writes a PLY file directly, and ignores points with negative 𝑍 values.
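As a rough illustration of the conversion from disparities to 3D points and of the ASCII PLY layout, the sketch below assumes an ideal rectified pair with a shared focal length f (in pixels), principal point (cx, cy) and baseline B, for which Z = f B / d, X = (x - cx) Z / f and Y = (y - cy) Z / f. The lecture notes may present this differently, and all of the type and function names here are hypothetical, not those of the sample code. Colour is omitted to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Point3 { double x, y, z; };

// Hypothetical description of the rectified pair; the real values would come
// from the rectification step, not from this sketch.
struct RectifiedStereo {
    double focalLength;  // in pixels
    double cx, cy;       // principal point, in pixels
    double baseline;     // distance between the camera centres
};

// Z = f * B / d, then back-project the pixel through the pinhole model.
Point3 reproject(const RectifiedStereo& cam, double x, double y, double d) {
    assert(d > 0.0);
    const double Z = cam.focalLength * cam.baseline / d;
    return Point3{(x - cam.cx) * Z / cam.focalLength,
                  (y - cam.cy) * Z / cam.focalLength, Z};
}

// Converts a row-major disparity map to points, skipping pixels whose
// disparity is zero or negative (no match, or a point at infinity).
std::vector<Point3> disparityToPoints(const RectifiedStereo& cam,
                                      const std::vector<int>& disparity,
                                      int width, int height) {
    assert(disparity.size() == static_cast<std::size_t>(width) * height);
    std::vector<Point3> points;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            const int d = disparity[y * width + x];
            if (d > 0) points.push_back(reproject(cam, x, y, d));
        }
    return points;
}

// Builds the text of an ASCII PLY file containing only vertex positions.
std::string formatPLY(const std::vector<Point3>& points) {
    std::ostringstream out;
    out << "ply\nformat ascii 1.0\n"
        << "element vertex " << points.size() << "\n"
        << "property float x\nproperty float y\nproperty float z\n"
        << "end_header\n";
    for (const Point3& p : points)
        out << p.x << ' ' << p.y << ' ' << p.z << '\n';
    return out.str();
}
```

The returned string can be written to a .ply file with std::ofstream and opened in MeshLab. Because Z is inversely proportional to d, a one-pixel disparity error matters far more for distant points than for near ones.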
A 3D view of the model produced from the stereo pair.
The models produced by simple block matching with the SSD or SAD are quite noisy, so the
last thing you need to do for the assignment is to experiment with some way to reduce the
number of errors. Possibilities include, but are not limited to:
• Thresholding the SSD or SAD: Correct matches should give low values for the SSD or
SAD. Rejecting points where the best match has a relatively high value can remove
some errors. This requires a threshold value which can either be absolute (reject
points with an error over some fixed value) or relative (keep some percentage of
points with the best matches).
• Rejecting ambiguous matches: As with SIFT feature matching, you might reject those
points where the second-best match is similar to the best match as they are
ambiguous. You need to be careful here, however, as if the correct match is at
disparity 𝑑, you can get very similar values for 𝑑 + 1 and 𝑑 − 1. An alternative is to
consider the error as a function of disparity, and reject cases where there are multiple
similar minima in the curve.
• Rejecting matches that are inconsistent with their neighbours: The world tends to be
made of continuous surfaces, so the disparity (equivalently depth) at a point is usually
similar to its neighbours. Rejecting points that don’t agree with most of the points in
some local neighbourhood can remove a lot of bad matches. This requires defining
the local neighbourhood, and determining what level of agreement is required.
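The three ideas above could be prototyped along the following lines. This is only one possible sketch under stated assumptions: the names selectDisparity, filterByNeighbours and kInvalid are hypothetical, the threshold is the absolute variant, the ambiguity test is a SIFT-style ratio test that skips the disparities immediately next to the best one (to allow for the similar values at 𝑑 + 1 and 𝑑 − 1), and "agreement" is measured as the fraction of valid neighbours whose disparity is within a tolerance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <vector>

const int kInvalid = -1;  // marker for "no disparity"; an assumption of this sketch

// costs[d] is the SSD or SAD at disparity d for one pixel. Returns the best
// disparity, or kInvalid if it fails the absolute threshold or is ambiguous.
int selectDisparity(const std::vector<long>& costs, long maxCost, double ratio) {
    assert(!costs.empty());
    int best = 0;
    for (int d = 1; d < int(costs.size()); ++d)
        if (costs[d] < costs[best]) best = d;
    if (costs[best] > maxCost) return kInvalid;  // absolute threshold
    // Lowest cost among disparities that are not adjacent to the best one.
    long second = std::numeric_limits<long>::max();
    for (int d = 0; d < int(costs.size()); ++d)
        if (std::abs(d - best) > 1 && costs[d] < second) second = costs[d];
    // Ambiguous if the best cost is not clearly below the runner-up.
    if (double(costs[best]) >= ratio * double(second)) return kInvalid;
    return best;
}

// Keeps a disparity only if at least `minAgree` (0..1) of the valid neighbours
// within `radius` pixels differ from it by no more than `tolerance`.
std::vector<int> filterByNeighbours(const std::vector<int>& disp, int width,
                                    int height, int radius, int tolerance,
                                    double minAgree) {
    assert(disp.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> out(disp.size(), kInvalid);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int d = disp[y * width + x];
            if (d == kInvalid) continue;
            int valid = 0, agree = 0;
            for (int ny = y - radius; ny <= y + radius; ++ny) {
                for (int nx = x - radius; nx <= x + radius; ++nx) {
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    if (nx == x && ny == y) continue;
                    const int nd = disp[ny * width + nx];
                    if (nd == kInvalid) continue;
                    ++valid;
                    if (std::abs(nd - d) <= tolerance) ++agree;
                }
            }
            if (valid > 0 && agree >= minAgree * valid) out[y * width + x] = d;
        }
    }
    return out;
}
```

Each filter adds parameters of its own (maxCost, ratio, radius, tolerance, minAgree). Stricter settings remove more bad matches but also discard more good ones, which is the trade-off to examine in your experiments.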
[3] http://paulbourke.net/dataformats/ply/
[4] https://www.meshlab.net/
Disparity map and 3D view of the model produced by OpenCV’s block matching algorithm.
Note that there are many points without disparities, but those that remain are correct.
Sample Code
Sample code is provided which reads a stereo pair from an .MPO (multiple picture object) file,
which is essentially two .JPG images concatenated together. It also reads the calibration data
from a text file. From the calibration data it computes rectified images, which may also be
scaled in size. Scaling the images down can improve runtime, and also reduces the effects of
image noise or inaccuracies in the calibration estimate. Several sample .MPO files are also
provided, and you can borrow the camera to capture additional images.
Capturing Test Data
The stereo camera will be available for capturing test data by arrangement. However, you should demonstrate a basic level
of performance on the sample images provided before you capture additional test data. Since
there is only one camera, it will only be made available for short periods of time, and
booking in advance is essential.
Assignment Requirements
You should submit the following:
• Source code for your solution. This could just be an updated version of the file
stereoDepthMain.cpp, but you may wish to write additional classes. You should
not use (or need) any third-party libraries or code.
• A report, covering the following material:
  ◦ How your code solves the problems of disparity estimation and 3D point
    computation.
  ◦ The parameters that control the 3D reconstruction process, what effect they
    have, and how they should be chosen. It is expected that you would need to
    conduct some experiments to determine this information, and you should
    present those experiments clearly.
  ◦ An evaluation of how effective your solution is, including discussion of cases
    that do and don’t work well with your implementation.