
README

=====================================================================
====  Casual 3D Photography - Dataset Release                    ====
====  Peter Hedman, Suhib Alsisan, Rick Szeliski, Johannes Kopf  ====
====  SIGGRAPH Asia 2017                                         ====
=====================================================================

  Structure-from-Motion is ambiguous with respect to orientation and scale. This means that
  our sparse reconstructions produced by COLMAP do not have consistent scales and up-vectors.
  As we warp the reconstructed depth maps into a central panorama, we also correct for these
  issues and transform everything into a normalized coordinate space where:
    1) The scale is metric, and
    2) The up vector corresponds to (0,1,0).
    
  These stages produce outputs in the COLMAP coordinate space:
    - Sparse reconstruction, and
    - Dense reconstruction.
   
  The following stages produce outputs in the normalized coordinate space:
    - Warping
    - Stitching
    - Two-layer fusion.
   
  All of our panoramas use the equirectangular projection, i.e., pixel coordinates
  map linearly to longitude and latitude.
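   
  For illustration, the sketch below shows one way a unit direction in the
  normalized space could map to equirectangular pixel coordinates. The image
  orientation convention (where longitude 0 lands, which row is the zenith)
  is an assumption here, not part of the release, so verify it against the
  panoramas before relying on it:
   
      #include <cmath>
      #include <Eigen/Dense>
      
      // Sketch: map a unit direction (normalized space: up = (0,1,0),
      // forward = (0,0,-1)) to equirectangular pixel coordinates.
      Eigen::Vector2f directionToEquirect(const Eigen::Vector3f& dir,
                                          int panoWidth, int panoHeight) {
          const float kPi = 3.14159265358979f;
          float longitude = std::atan2(dir.x(), -dir.z());  // 0 at the forward direction
          float latitude  = std::asin(dir.y());             // 0 at the horizon
          float u = (longitude + kPi) / (2.f * kPi);        // [0,1] across the panorama
          float v = (kPi / 2.f - latitude) / kPi;           // [0,1], top row = zenith
          return Eigen::Vector2f(u * panoWidth, v * panoHeight);
      }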
   
======================================
== Scene parameters (in params.txt) ==
======================================

      panoUp            : The up-vector of the central panorama in COLMAP coordinates.
                          After normalization, this vector corresponds to (0,1,0).
      panoForward       : The forward-vector of the central panorama in COLMAP coordinates.
                          After normalization, this vector corresponds to (0,0,-1).
      panoCenter        : The center-of-projection of the panorama in COLMAP coordinates.
                          After normalization, this point corresponds to the origin (0,0,0).
      colmapToMeters    : The scaling factor that converts the scale of the COLMAP coordinate
                          space to the approximately metric scale of the normalized coordinate space.
      minDepthForScene  : The depth of the nearest depth plane in the COLMAP scale. After normalization,
                          this corresponds to 0.5 m as reported in the paper.
      maxDepthForScene  : The depth of the farthest depth plane in the COLMAP scale. After normalization,
                          this corresponds to 1500 m as reported in the paper.

  For reference, the following function transforms a point from the COLMAP
  coordinate space to our normalized coordinates:
  
      #include <Eigen/Dense>
      
      Eigen::Vector3f toNormalizedCoordinates(Eigen::Vector3f point,
                                              Eigen::Vector3f panoUp,
                                              Eigen::Vector3f panoForward,
                                              Eigen::Vector3f panoCenter,
                                              float colmapToMeters) {
          // Build an orthonormal basis from the panorama axes.
          Eigen::Vector3f panoRight = panoForward.cross(panoUp).normalized();
          Eigen::Vector3f orthogonalPanoUp = panoRight.cross(panoForward).normalized();
          
          Eigen::Matrix3f rotation;
          rotation.row(0) = panoRight;
          rotation.row(1) = orthogonalPanoUp;
          rotation.row(2) = -panoForward.normalized();
          
          // Translate to the panorama center, rotate into the panorama frame,
          // and convert to the metric scale.
          return rotation * (point - panoCenter) * colmapToMeters;
      }
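
  Since the rotation above is orthonormal, the inverse mapping, from normalized
  coordinates back to the COLMAP coordinate space, follows directly. A sketch
  derived from the function above:
  
      Eigen::Vector3f toColmapCoordinates(Eigen::Vector3f point,
                                          Eigen::Vector3f panoUp,
                                          Eigen::Vector3f panoForward,
                                          Eigen::Vector3f panoCenter,
                                          float colmapToMeters) {
          Eigen::Vector3f panoRight = panoForward.cross(panoUp).normalized();
          Eigen::Vector3f orthogonalPanoUp = panoRight.cross(panoForward).normalized();
          
          Eigen::Matrix3f rotation;
          rotation.row(0) = panoRight;
          rotation.row(1) = orthogonalPanoUp;
          rotation.row(2) = -panoForward.normalized();
          
          // Undo the metric scaling, rotate back (transpose = inverse for
          // rotations), and re-add the panorama center.
          return rotation.transpose() * (point / colmapToMeters) + panoCenter;
      }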


==================
== Input images ==
==================

    source\*.jpg  : Our input images after pre-processing
                    (cropped, white-balanced, and with
                     color fringing removed).

                                             
===========================
== Sparse reconstruction ==
===========================

  As exported by COLMAP 1.1 (https://colmap.github.io/).
  
    sparse\undistorted\*.png             : Color images after undistortion.
    sparse\undistorted\*.camera.txt      : Camera intrinsic parameters.
    sparse\undistorted\*.prj_matrix.txt  : Projection matrix from world space to image coordinates.
    sparse\model\*                       : Model that can be loaded with any version of COLMAP.
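
  As an illustration, the sketch below projects a COLMAP world-space point into
  an undistorted image using a *.prj_matrix.txt file. It assumes the file stores
  a 3x4 matrix as 12 whitespace-separated values in row-major order; verify this
  against the released files:
  
      #include <fstream>
      #include <string>
      #include <Eigen/Dense>
      
      Eigen::Vector2f projectPoint(const std::string& matrixPath,
                                   const Eigen::Vector3f& worldPoint) {
          // Assumed format: 3x4 projection matrix, row-major plain text.
          Eigen::Matrix<float, 3, 4> P;
          std::ifstream file(matrixPath);
          for (int r = 0; r < 3; ++r)
              for (int c = 0; c < 4; ++c)
                  file >> P(r, c);
          
          // Homogeneous projection followed by the perspective division.
          Eigen::Vector3f projected = P * worldPoint.homogeneous();
          return projected.hnormalized();
      }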


==========================
== Dense reconstruction ==
==========================

    dense\depthmaps\*.png              : Our depth maps as 16-bit PNGs.
    dense\geometric-consistency\*.png  : Geometric consistency maps as binary images
                                         (0   => these pixels failed the consistency check,
                                          255 => these pixels passed the check).

  Use the following function to load a depth map with the COLMAP scale:
  
      #include <opencv2/opencv.hpp>
      
      cv::Mat1f loadDepthMap(const std::string& path,
                             float minDepthForScene,
                             float maxDepthForScene) {
          static const float kMaxValue16 = 65535.f;
          
          // IMREAD_UNCHANGED keeps the 16-bit single-channel data intact
          // (the default flags would convert it to 8-bit BGR).
          cv::Mat input = cv::imread(path, cv::IMREAD_UNCHANGED);
          cv::Mat1f output;
          input.convertTo(output, CV_32F, 1 / kMaxValue16);
          // Map the normalized [0,1] values to the [min,max] depth range.
          output = (output * (maxDepthForScene - minDepthForScene)) + minDepthForScene;
          return output;
      }
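
  For example, to keep only depths that passed the geometric consistency check
  (the file name below is a placeholder):
  
      cv::Mat1f depth = loadDepthMap("dense/depthmaps/0001.png",  // placeholder name
                                     minDepthForScene, maxDepthForScene);
      cv::Mat mask = cv::imread("dense/geometric-consistency/0001.png",
                                cv::IMREAD_GRAYSCALE);
      depth.setTo(0.f, mask == 0);  // zero out pixels that failed the check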

      
===========================
== Front surface warping ==
===========================

  Instead of creating separate data outputs for each dense image (depth maps, 
  geometric consistency and colors), we store UV maps for each warped image.
  These UV maps are stored using the OpenGL convention, where the range is
  [0,1]^2 and (0,0) corresponds to the bottom left corner of the image.
 
    warp\front\uvs\*.png      : UV maps warped into the central panorama as 16-bit PNGs.
    warp\front\stretch\*.png  : Stretch penalty in the [0,1] range, stored as 8-bit PNGs.

  Use the following functions to load the UV maps and the stretch maps:
  
      cv::Mat3f loadWarpedUvMap(const std::string& path) {
          static const float kMaxValue16 = 65535.f;
          
          // Keep the 16-bit channels intact instead of the default 8-bit conversion.
          cv::Mat input = cv::imread(path, cv::IMREAD_UNCHANGED);
          cv::Mat3f output;
          input.convertTo(output, CV_32F, 1 / kMaxValue16);
          return output;
      }
      
      cv::Mat1f loadStretchPenalty(const std::string& path) {
          static const float kMaxValue8 = 255.f;
          
          // Load as a single-channel image to match the cv::Mat1f output.
          cv::Mat input = cv::imread(path, cv::IMREAD_GRAYSCALE);
          cv::Mat1f output;
          input.convertTo(output, CV_32F, 1 / kMaxValue8);
          return output;
      }
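
  Because the UV maps use the OpenGL convention while OpenCV images have their
  origin in the top-left corner, the V coordinate must be flipped when sampling.
  The sketch below warps a source image into the panorama with cv::remap; which
  image channels hold U and V is an assumption here (OpenCV loads channels in
  BGR order), so check it against the data:
  
      cv::Mat3b warpIntoPanorama(const cv::Mat3b& source, const cv::Mat3f& uvMap) {
          cv::Mat1f mapX(uvMap.size()), mapY(uvMap.size());
          for (int y = 0; y < uvMap.rows; ++y) {
              for (int x = 0; x < uvMap.cols; ++x) {
                  const cv::Vec3f& uv = uvMap(y, x);
                  mapX(y, x) = uv[2] * (source.cols - 1);          // assumed U channel
                  mapY(y, x) = (1.f - uv[1]) * (source.rows - 1);  // flip V: bottom-left
                                                                   // to top-left origin
              }
          }
          cv::Mat3b warped;
          cv::remap(source, warped, mapX, mapY, cv::INTER_LINEAR);
          return warped;
      }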


==========================
== Back surface warping ==
==========================

    warp\back\uvs\*.png      : UV maps warped into the central panorama as 16-bit PNGs.
    warp\back\stretch\*.png  : Stretch penalty in the [0,1] range, stored as 8-bit PNGs.

  Use the functions above (loadWarpedUvMap, loadStretchPenalty) to load these.
   
    
===========================
== Front layer stitching ==
===========================

    stitch\front\labels.png       : The indices of the images used at every pixel, stored in 8-bit PNGs.
    stitch\front\stitched.jpg     : The stitched front layer texture.
    stitch\front\disparities.png  : The stitched front layer depth, encoded as disparity (1 / meters)
                                    in 16-bit PNGs. Note: These are stored using the normalized,
                                    metric scene scale.
                                          
  Use the following function to load the disparity maps
  and convert them to normalized (metric) depth maps:

      cv::Mat1f loadPanoDepths(const std::string& path) {
          // The depth range is [0.5, 1500] meters, so disparities range from ~0 to 2 m^-1.
          static const float kMaxDisparity = 2.0f;
          static const float kMaxValue16 = 65535.f;
          
          // Keep the 16-bit data intact instead of the default 8-bit conversion.
          cv::Mat input = cv::imread(path, cv::IMREAD_UNCHANGED);
          cv::Mat1f output;
          input.convertTo(output, CV_32F, kMaxDisparity / kMaxValue16);
          // Invert disparity to obtain metric depth.
          return 1.0f / output;
      }
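
  Combined with an equirectangular mapping, a stitched depth panorama can be
  lifted to 3D points in the normalized space. A sketch, under the same assumed
  orientation convention as directionToEquirect above:
  
      Eigen::Vector3f panoPixelToPoint(float x, float y, float depthMeters,
                                       int panoWidth, int panoHeight) {
          const float kPi = 3.14159265358979f;
          float longitude = (x / panoWidth) * 2.f * kPi - kPi;
          float latitude  = kPi / 2.f - (y / panoHeight) * kPi;
          
          // Unit viewing direction for this pixel (up = (0,1,0), forward = (0,0,-1)).
          Eigen::Vector3f dir(std::cos(latitude) * std::sin(longitude),
                              std::sin(latitude),
                              -std::cos(latitude) * std::cos(longitude));
          // The panorama's center of projection is the origin after normalization.
          return dir * depthMeters;
      }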

                                          

==========================
== Back layer stitching ==
==========================

    stitch\back\labels.png       : The indices of the images used at every pixel, stored in 8-bit PNGs.
    stitch\back\stitched.jpg     : The stitched back layer texture.
    stitch\back\disparities.png  : The stitched back layer depth, encoded as disparity (1 / meters)
                                   in 16-bit PNGs. Note: These are stored using the normalized,
                                   metric scene scale.

  Use the function above (loadPanoDepths) to load the disparity maps
  and convert them to normalized depths.



=======================
== Two-layer merging ==
=======================

    merge\mesh.ply   : The final, two-layer mesh produced by our reconstruction system. 
                       Stored in the normalized coordinate space.
    merge\atlas.jpg  : The texture atlas used by our output mesh.