A Concise P3P Method Based on Homography Decomposition


Shen Cai1*, Yao Huang1, Siyu Zhang1, Zhijun Fang1, Jiachun Wang1,2, Hao Yin1, Shibin Xie1, Junchi Yan3, Shuhan Shen4,5*,

1Donghua University    2Huawei Co.    3Shanghai Jiao Tong University    4Institute of Automation, Chinese Academy of Sciences    5University of Chinese Academy of Sciences.

Abstract


This paper introduces a novel perspective-3-point (P3P) method based on homography decomposition. We argue that this method is concise for several reasons: 1) A fundamental geometric relationship, namely the similarity of two triangles, is derived from our decomposition of the homography, leading to two quadratic constraints on two unknown variables. 2) The proposed homography decomposition is versatile, unifying most previously used constraints and variables. 3) The similarity of the two triangles is further exploited to promptly remove ambiguous solutions of the quadratic constraints, which avoids recovering multiple poses and thereby saves computation. 4) The proposed resolution process is streamlined, requiring low computational cost and only one square root operation, apart from solving a univariate cubic or quartic equation as most methods do. Experimental results validate the computational efficiency and numerical accuracy of our method.



Constraints Construction


The following figure depicts the similarity relationship between the object triangle $\triangle{M\!N\!P}$ and the parallel image-related triangle $\triangle{\widetilde{M}\!\widehat{N}\!\widehat{P}}$, which we derive from homography decomposition. Here, $M$, $N$, and $P$ are the three object points, while $\widetilde{M}$, $\widetilde{N}$, and $\widetilde{P}$ are the corresponding image points. The world frame, the camera frame, and the object frame are denoted by $\Lambda$, $\Gamma$, and $\Phi$, respectively.



The homography $\mathbf{H}$ between the object plane $\mathcal{O}$ and the image $\mathcal{I}$ is derived as \begin{equation} \mathbf{H} \equiv \mathbf{A} \;[_{\Phi}^{\Gamma}\mathbf{R}_{(:,1)}, \,_{\Phi}^{\Gamma}\mathbf{R}_{(:,2)}, \,_{\Phi}^{\Gamma}\mathbf{t}] = f\;[\widetilde{\mathbf{m}}_{(\overline{2})}^{\mathcal{I}}, \widetilde{\mathbf{n}}_{(\overline{2})}^{\mathcal{I}}, \widetilde{\mathbf{p}}_{(\overline{2})}^{\mathcal{I}}] \begin{bmatrix} \lambda_{M} & & \\ & \lambda_{N} & \\ & & \lambda_{P} \end{bmatrix} [\mathbf{m}_{(\overline{2})}^{\mathcal{O}}, \mathbf{n}_{(\overline{2})}^{\mathcal{O}}, \mathbf{p}_{(\overline{2})}^{\mathcal{O}}]^{-1}. \end{equation}

After further manipulation, the following matrix form containing the geometric constraints is obtained \begin{equation} [_{\Phi}^{\Gamma}\mathbf{R}_{(:,1)}, \,_{\Phi}^{\Gamma}\mathbf{R}_{(:,2)}, \,_{\Phi}^{\Gamma}\mathbf{t}]\begin{bmatrix} x_{N}^{\Phi} & x_{P}^{\Phi} & \\[0.5mm] & y_{P}^{\Phi} & \\ & & 1 \end{bmatrix} = \lambda_{M}\,[\widetilde{\mathbf{n}}_{(3)}^{\Gamma}, \widetilde{\mathbf{p}}_{(3)}^{\Gamma}, \widetilde{\mathbf{m}}_{(3)}^{\Gamma}] \begin{bmatrix} \lambda_1 & & \\ & \lambda_2 & \\ -1 & -1 & 1 \end{bmatrix}. \end{equation}

Since $\triangle{M\!N\!P}$ is similar to $\triangle{\widetilde{M}\!\widehat{N}\!\widehat{P}}$, the dot products of the corresponding sides satisfy the following relationships \begin{equation} \frac{d_{\widetilde{M}\!\widehat{N}\!\widehat{P}}}{d_{\widetilde{M}\!\widehat{N}}} = \frac{d_{M\!N\!P}}{d_{M\!N}}, \qquad \frac{d_{\widetilde{M}\!\widehat{P}}}{d_{\widetilde{M}\!\widehat{N}}} = \frac{d_{M\!P}}{d_{M\!N}}. \end{equation}

As a result, two quadratic constraints containing the unknowns $\lambda_1$ and $\lambda_2$ can be constructed.
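These constraints arise because a similarity transform preserves the ratios of dot products between triangle sides. A minimal numpy sketch (with a hypothetical helper `dot_ratios`, not from the paper) checks this invariance under an arbitrary rotation, uniform scaling, and translation:

```python
import numpy as np

def dot_ratios(M, N, P):
    """Return (d_MNP/d_MN, d_MP/d_MN) for triangle MNP, where
    d_MNP = MN.MP, d_MN = MN.MN, and d_MP = MP.MP."""
    mn, mp = N - M, P - M
    return (mn @ mp) / (mn @ mn), (mp @ mp) / (mn @ mn)

rng = np.random.default_rng(0)
M, N, P = rng.standard_normal((3, 3))

# An arbitrary similarity transform: rotation (via QR), scale, translation.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
s, t = 2.5, rng.standard_normal(3)
Ms, Ns, Ps = (s * (Q @ X) + t for X in (M, N, P))

# Both ratios are invariant, which is what turns the triangle similarity
# into two scalar constraints on the unknowns.
print(np.allclose(dot_ratios(M, N, P), dot_ratios(Ms, Ns, Ps)))  # True
```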



Direct Expression of Complete Rotation


The complete rotation from world to camera is directly expressed with the vectors and variables, most of which are shown in the above figure. \begin{equation} \begin{split} _{ \Lambda}^{ \Gamma}\!\mathbf{R} = _{ \Phi}^{ \Gamma}\!\mathbf{R} \, _{ \Lambda}^{ \Phi}\!\mathbf{R} = \frac{\lambda_{ M}}{d_{ M\!S}} [\overrightarrow{\widetilde{M}\widehat{N}}^{ \Gamma}, \overrightarrow{\widetilde{M}\widehat{P}}^{ \Gamma}, \overrightarrow{\widetilde{M}\widehat{S}}^{ \Gamma}] \begin{bmatrix} d_{ M\!P} & -d_{ M\!N\!P} & \\ -d_{ M\!N\!P} & d_{ M\!N} & \\ & & \lambda_{ M} \end{bmatrix}\!\! \begin{bmatrix} {\overrightarrow{MN}^{ \Lambda}}^{\mathrm{T}} \\ {\overrightarrow{MP}^{ \Lambda}}^{\mathrm{T}} \\ {\overrightarrow{MS}^{ \Lambda}}^{\mathrm{T}} \end{bmatrix}\!. \end{split} \label{equ:rotationExpression} \end{equation}
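The expression above recovers the rotation as a product of a camera-frame vector triplet, an adjugate-type middle matrix, and a world-frame vector triplet. As a sanity sketch only (not the paper's exact formula), the same linear structure can be checked by recovering a rotation from two corresponding vector triplets, where the third vector plays the role of $\overrightarrow{MS} = \overrightarrow{MN} \times \overrightarrow{MP}$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground-truth rotation from a random matrix via QR (sign flip ensures det=+1).
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R_true = Q if np.linalg.det(Q) > 0 else -Q

# World-frame edge vectors MN, MP and the auxiliary vector MS = MN x MP.
mn, mp = rng.standard_normal(3), rng.standard_normal(3)
A = np.column_stack([mn, mp, np.cross(mn, mp)])

# Their camera-frame counterparts under the rotation.
B = R_true @ A

# The rotation follows linearly from the two corresponding triplets,
# mirroring the closed-form three-matrix product in the expression above.
R_est = B @ np.linalg.inv(A)
print(np.allclose(R_est, R_true))  # True
```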



Early Verification Based on the 3D Similarity Error Introduced by the Fourth Point


Existing P3P+RANSAC frameworks sample three or four point correspondences as a minimal sample in each iteration. We revisit the problem of removing pose ambiguity when four point correspondences are sampled per iteration. With the fourth point correspondence introduced, the tetrahedra ${M\!N\!P\!Q}$ and ${\widetilde{M}\!\widehat{N}\!\widehat{P}\!\widehat{Q}}$ are similar, as depicted below




Without recovering the pose, we can use the solved $\lambda_1$ and $\lambda_2$ to compute a 3D similarity error that removes the ambiguous solutions of the two quadratic constraints in advance. We have experimentally verified that this error is as robust as the commonly used 2D reprojection error in the P3P+RANSAC pipeline.
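A hypothetical version of such a 3D similarity error can be sketched as the spread of the ratios between corresponding pairwise distances of the two tetrahedra (zero exactly when they are similar); the error definition in the paper may differ:

```python
import numpy as np
from itertools import combinations

def similarity_error(cam_pts, obj_pts):
    """Hypothetical 3D similarity error: relative spread of the ratios
    between corresponding pairwise distances of two tetrahedra."""
    ratios = [np.linalg.norm(cam_pts[i] - cam_pts[j]) /
              np.linalg.norm(obj_pts[i] - obj_pts[j])
              for i, j in combinations(range(4), 2)]
    return np.std(ratios) / np.mean(ratios)

rng = np.random.default_rng(2)
obj = rng.standard_normal((4, 3))              # object points M, N, P, Q

# A consistent candidate: a rotated, scaled, translated copy of the object.
Qr, _ = np.linalg.qr(rng.standard_normal((3, 3)))
good = 1.7 * obj @ Qr.T + np.array([0.1, 0.2, 5.0])

# An ambiguous candidate: the fourth point is displaced.
bad = good.copy()
bad[3] *= 1.3

print(similarity_error(good, obj) < 1e-9)      # True: consistent solution passes
print(similarity_error(bad, obj) > 1e-3)       # True: ambiguous solution is rejected
```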



Experimental Results


1. Numerical Accuracy


The following table shows the numerical accuracy on the noise-free synthetic data.



2. Running Time


The following table shows the running time of different P3P algorithms.



3. Real Localization Experiments


The following table shows the localization results within the vanilla RANSAC framework on the Cambridge Landmarks dataset.





Four-Axis-Rotation Decomposition


By utilizing the coplanarity of $MN$ and $\widetilde{M}\widetilde{N}$, we can decompose the rotation from the object frame to the camera frame into four sequential rotations around frame axes: \begin{equation} _{ \Phi}^{ \Gamma}\!\mathbf{R} = \, _{ \Theta}^{ \Gamma}\!\mathbf{R}_z(\theta) \,\, _{ \Omega}^{ \Theta}\!\mathbf{R}_x(\omega) \,\, _{ \Psi}^{ \Omega}\!\mathbf{R}_y(\psi) \,\, _{ \Phi}^{ \Psi}\!\mathbf{R}_x(\phi). \label{equ:rotationOurExpression} \end{equation} The video below provides an animation of the proposed rotation process.
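A minimal sketch (with illustrative angle values, not from the paper) composes the four elementary rotations and checks that the product is a valid rotation matrix:

```python
import numpy as np

# Elementary rotations about the x, y, and z axes.
def Rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Illustrative angle values.
theta, omega, psi, phi = 0.4, -0.9, 1.3, 0.2

# The four-axis decomposition: R = Rz(theta) Rx(omega) Ry(psi) Rx(phi).
R = Rz(theta) @ Rx(omega) @ Ry(psi) @ Rx(phi)

# Any such product is itself a rotation: orthonormal with determinant +1.
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))  # True True
```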