<p>The Mirror Environments (MirrEnv) dataset consists of RGBD image sequences with ground-truth camera localization data. Three mirror sizes were used (small, medium, large); for each size the mirror was either visible or covered by a green card, and a further condition had the mirror removed from the scene entirely, giving 7 mirror-presence variations. Each of these 7 variations was combined with 7 pre-calculated robot arm trajectories, for a total of 49 sequences. Binary masks of the mirror region are provided for every 10th frame of sequences containing mirrors.</p><p>The dataset structure is shown below. Each sequence is labeled <code>Trj_X1_X2_X3_X4</code>, where <code>X1</code> is the numerical index across all 49 sequences; <code>X2</code> is the trajectory name; <code>X3</code> indicates the mirror size; and <code>X4</code> indicates whether the mirror is visible (W) or covered (C). If there is no mirror, <code>X3</code> and <code>X4</code> are replaced with <code>No_Mirror</code>.</p><p><code>Trj_X1_X2_X3_X4</code><br>|----<code>calib</code><br>| |----<code>images</code><br>| | |----<code>depth</code><br>| | | |----<code>#timestamp#.png</code> (16-bit greyscale image, 1280x720)<br>| | | |----<code>......</code><br>| | |----<code>rgb</code><br>| | | |----<code>#timestamp#.png</code> (24-bit rgb image, 1280x720)<br>| | | |----<code>......</code><br>| | |----<code>rs_intrinsics.xml</code><br>| |----<code>poses</code><br>| | |----<code>#timestamp#.txt</code> (4x4 rigid body transformation relative to robot base)<br>| | |----<code>......</code><br>| |----<code>calib_X.txt</code><br>| |----<code>DepthFactor.txt</code><br>| <br>|----<code>trj</code><br>| |----<code>images</code><br>| | |----<code>depth</code><br>| | | |----<code>#timestamp#.png</code> (16-bit greyscale image, 640x480)<br>| | | |----<code>......</code><br>| | |----<code>rgb</code><br>| | | |----<code>#timestamp#.png</code> (24-bit rgb image, 640x480)<br>| | | |----<code>......</code><br>| | |----<code>masks</code> (only available in sequences containing mirrors, i.e. X4 = W)<br>| | | |----<code>#timestamp#.png</code> (8-bit binary image, 640x480)<br>| | | |----<code>......</code><br>| | |----<code>c_names.txt</code> (filenames of rgb frames - timestamp in seconds since Unix epoch)<br>| | |----<code>d_names.txt</code> (filenames of depth frames - timestamp in seconds since Unix epoch)<br>| | |----<code>depth.mp4</code> (original video of depth frames)<br>| | |----<code>rgb.avi</code> (original video of rgb frames)<br>| |----<code>poses</code><br>| | |----<code>Qposes.txt</code> (camera pose relative to the robot base: XYZ position in metres and unit quaternion orientation)</p><p>The RGB and depth frames have slightly differing timestamps. During experiments they were associated using the <code>associate.py</code> Python script available from the <a href="https://cvg.cit.tum.de/data/datasets/rgbd-dataset/tools">TUM RGBD dataset tools</a>.</p><p>Research results based upon these data are published at <a href="https://doi.org/10.1007/s41095-022-0329-x">https://doi.org/10.1007/s41095-022-0329-x</a>.</p>
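<p>As an illustration of the depth format, below is a minimal Python sketch that loads one depth frame and converts it to metres. It assumes <code>DepthFactor.txt</code> holds a single scale value by which the raw 16-bit readings are divided (a common RGBD convention, but verify against the file itself); the sequence name and timestamp in the paths are placeholders.</p><pre><code>
# Minimal sketch: load a 16-bit depth frame and convert raw values to metres.
# Assumes DepthFactor.txt contains a single number and that
# depth_metres = raw / factor; verify this against the actual file.
import cv2          # pip install opencv-python
import numpy as np

sequence = "Trj_X1_X2_X3_X4"                            # placeholder name
frame = f"{sequence}/trj/images/depth/#timestamp#.png"  # placeholder timestamp

with open(f"{sequence}/calib/DepthFactor.txt") as f:
    depth_factor = float(f.read().strip())

raw = cv2.imread(frame, cv2.IMREAD_UNCHANGED)        # uint16 array, 640x480
depth_m = raw.astype(np.float32) / depth_factor      # metres; 0 = no reading
</code></pre>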
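<p>A similar sketch turns one row of <code>Qposes.txt</code> into a 4x4 homogeneous transform. The column order <code>tx ty tz qx qy qz qw</code> is an assumption made for illustration, not a documented layout; check the file for the actual ordering and for any leading timestamp column.</p><pre><code>
# Sketch: convert one Qposes.txt row (XYZ position in metres + unit
# quaternion) into a 4x4 homogeneous transform. Column order is assumed.
import numpy as np

def pose_to_matrix(tx, ty, tz, qx, qy, qz, qw):
    """4x4 rigid transform from position and unit quaternion (w last)."""
    # Standard quaternion-to-rotation-matrix formula.
    R = np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = [tx, ty, tz]
    return T

rows = np.loadtxt("trj/poses/Qposes.txt")  # one pose per row (assumed layout)
T0 = pose_to_matrix(*rows[0])
</code></pre>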
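<p>For reference, here is a minimal re-implementation of the nearest-timestamp matching that <code>associate.py</code> performs. This is a sketch of the idea, not the original script; the 0.02 s threshold is an assumed default, and the parsing assumes <code>c_names.txt</code> / <code>d_names.txt</code> list one <code>#timestamp#.png</code> name per line.</p><pre><code>
# Minimal nearest-timestamp matcher in the spirit of TUM's associate.py.
# Adjust read_stamps() if the name files store bare timestamps instead.
def read_stamps(path):
    with open(path) as f:
        return sorted(float(line.strip().removesuffix(".png"))
                      for line in f if line.strip())

def associate(rgb_stamps, depth_stamps, max_diff=0.02):
    """Greedily pair each RGB stamp with the closest unused depth stamp."""
    pairs, used = [], set()
    for t_rgb in rgb_stamps:
        candidates = [t for t in depth_stamps if t not in used]
        if not candidates:
            break
        best = min(candidates, key=lambda t: abs(t - t_rgb))
        if abs(best - t_rgb) > max_diff:   # 0.02 s threshold is an assumption
            continue
        pairs.append((t_rgb, best))
        used.add(best)
    return pairs

rgb = read_stamps("trj/images/c_names.txt")    # paths relative to a sequence
depth = read_stamps("trj/images/d_names.txt")
matches = associate(rgb, depth)
</code></pre>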
Funding
DTP 2020-2021 Cardiff University (2020-10-01 - 2025-09-30); Graham, Kim. Funder: Engineering and Physical Sciences Research Council
Reflection Aware Visual Simultaneous Localization and Mapping (RA-vSLAM) (2020-10-01 - 2024-09-30); Herbert, Peter. Funder: Engineering and Physical Sciences Research Council
Images can be opened in most image viewers or loaded programmatically (e.g. in Python, as sketched below); the .txt and .xml files are readable in any text editor, and the .mp4 and .avi videos can be played in any media player with a suitable codec installed.
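<p>For example, a minimal Python sketch for a quick look at one RGB frame and one calibration pose file. The file names below keep the dataset's <code>#timestamp#</code> placeholders, and <code>np.loadtxt</code> assumes the 4x4 matrix is stored as whitespace-separated plain text.</p><pre><code>
# Quick-look sketch: open one RGB frame and one calibration pose file.
import numpy as np
from PIL import Image   # pip install pillow

img = Image.open("trj/images/rgb/#timestamp#.png")  # 24-bit RGB, 640x480
print(img.size, img.mode)

T = np.loadtxt("calib/poses/#timestamp#.txt")  # 4x4 transform, assuming plain
print(T.shape)                                 # whitespace-separated text
</code></pre>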