AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes
Abstract
Current learning-based Structure-from-Motion (SfM) methods struggle with videos of dynamic scenes captured by wide-angle cameras. We present AnyMap, a differentiable SfM framework that jointly addresses image distortion and motion estimation. By learning a general implicit camera model without predefined parameters, AnyMap handles lens distortion and estimates multi-view consistent 3D geometry, camera poses, and (un)projection functions. Joint estimation is ambiguous, because motion estimates can compensate for undistortion errors and vice versa. To resolve this ambiguity, we introduce a low-dimensional motion representation consisting of a set of learnable basis trajectories, which are interpolated to produce regularized motion estimates. Experimental results show that our method estimates camera poses accurately, performs well in camera calibration and image rectification, and enables high-quality novel view synthesis. The low-dimensional motion representation effectively disentangles undistortion from motion estimation, and the method outperforms existing methods.
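To make the idea of interpolated basis trajectories concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that each scene point blends K shared basis trajectories with softmax weights; the array layout, the softmax weighting, and the function name `interpolate_trajectories` are illustrative assumptions, since the abstract does not specify the interpolation scheme.

```python
import numpy as np


def interpolate_trajectories(basis, logits):
    """Blend K basis trajectories into N per-point trajectories.

    basis:  (K, T, 3) array, K basis trajectories over T frames (assumed layout).
    logits: (N, K) array, unnormalized per-point blending scores (assumed).
    Returns an (N, T, 3) array of per-point 3D displacements.
    """
    # Softmax over the basis axis yields convex interpolation weights.
    shifted = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted sum over K: every point's motion lies in the span of the bases.
    return np.einsum("nk,ktc->ntc", weights, basis)


rng = np.random.default_rng(0)
K, T, N = 4, 10, 1000
basis = rng.normal(size=(K, T, 3))   # would be learnable parameters in practice
logits = rng.normal(size=(N, K))     # would be predicted or learned per point
motion = interpolate_trajectories(basis, logits)
```

Under these assumptions, the N trajectories have at most K degrees of freedom per frame and coordinate, which is the sense in which such a representation is low-dimensional: the motion field has too little capacity to absorb arbitrary per-pixel undistortion errors.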