A Concise P3P Method Based on Homography Decomposition


Shen Cai1*, Yao Huang1, Siyu Zhang1, Zhijun Fang1, Jiachun Wang1,2, Hao Yin1, Shibin Xie1, Junchi Yan3, Shuhan Shen4,5*,

1Donghua University    2Huawei Co.    3Shanghai Jiao Tong University    4Institute of Automation, Chinese Academy of Sciences    5University of Chinese Academy of Sciences.

Abstract


This paper introduces a novel perspective-3-point (P3P) method based on homography decomposition. We argue that this method is concise for several reasons: 1) A fundamental geometric relationship, namely the similarity of two triangles, is derived from our decomposition of the homography, leading to two quadratic constraints on two unknown variables. 2) The proposed homography decomposition is versatile, unifying most previously used constraints and variables. 3) The similarity of the two triangles is further exploited to promptly remove ambiguous solutions of the quadratic constraints, which avoids recovering multiple poses and thereby saves computation. 4) The proposed resolution process is streamlined, requiring low computational cost and only one square root operation, apart from solving a univariate cubic or quartic equation as most methods do. Experimental results validate the computational efficiency and numerical accuracy of our method.



Constraints Construction


The following figure depicts the similarity relationship between the object triangle $\triangle{M\!N\!P}$ and the parallel image-related triangle $\triangle{\widetilde{M}\!\widehat{N}\!\widehat{P}}$, which we derive from homography decomposition. Here, $M$, $N$, and $P$ are the three object points, while $\widetilde{M}$, $\widetilde{N}$, and $\widetilde{P}$ are the corresponding image points. The world frame, the camera frame, and the object frame are denoted by $\Lambda$, $\Gamma$, and $\Phi$, respectively.



The homography $\mathbf{H}$ between the object plane $\mathcal{O}$ and the image $\mathcal{I}$ is derived as \begin{equation} \mathbf{H} \equiv \mathbf{A} \;[_{\Phi}^{\Gamma}\mathbf{R}_{(:,1)}, \,_{\Phi}^{\Gamma}\mathbf{R}_{(:,2)}, \,_{\Phi}^{\Gamma}\mathbf{t}] = f\;[\widetilde{\mathbf{m}}_{(\overline{2})}^{\mathcal{I}}, \widetilde{\mathbf{n}}_{(\overline{2})}^{\mathcal{I}}, \widetilde{\mathbf{p}}_{(\overline{2})}^{\mathcal{I}}] \begin{bmatrix} \lambda_{M} & & \\ & \lambda_{N} & \\ & & \lambda_{P} \end{bmatrix} [\mathbf{m}_{(\overline{2})}^{\mathcal{O}}, \mathbf{n}_{(\overline{2})}^{\mathcal{O}}, \mathbf{p}_{(\overline{2})}^{\mathcal{O}}]^{-1}. \end{equation}

After further manipulation, the following matrix form containing the geometric constraints is obtained \begin{equation} [_{\Phi}^{\Gamma}\mathbf{R}_{(:,1)}, \,_{\Phi}^{\Gamma}\mathbf{R}_{(:,2)}, \,_{\Phi}^{\Gamma}\mathbf{t}]\begin{bmatrix} x_{N}^{\Phi} & x_{P}^{\Phi} & \\[0.5mm] & y_{P}^{\Phi} & \\ & & 1 \end{bmatrix} = \lambda_{M}\,[\widetilde{\mathbf{n}}_{(3)}^{\Gamma}, \widetilde{\mathbf{p}}_{(3)}^{\Gamma}, \widetilde{\mathbf{m}}_{(3)}^{\Gamma}] \begin{bmatrix} \lambda_1 & & \\ & \lambda_2 & \\ -1 & -1 & 1 \end{bmatrix}. \end{equation}

Since $\triangle{M\!N\!P}$ is similar to $\triangle{\widetilde{M}\!\widehat{N}\!\widehat{P}}$, the dot products of the corresponding sides satisfy the following relationships \begin{equation} \frac{d_{\widetilde{M}\!\widehat{N}\!\widehat{P}}}{d_{\widetilde{M}\!\widehat{N}}} = \frac{d_{M\!N\!P}}{d_{M\!N}}, \qquad \frac{d_{\widetilde{M}\!\widehat{P}}}{d_{\widetilde{M}\!\widehat{N}}} = \frac{d_{M\!P}}{d_{M\!N}}. \end{equation}

As a result, two quadratic constraints containing the unknowns $\lambda_1$ and $\lambda_2$ can be constructed.
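These constraints arise because a similarity transform preserves the ratios of dot products between triangle sides. A minimal numpy sketch (with a hypothetical helper `dot_ratios`, not from the paper) checks this invariance under an arbitrary rotation, uniform scaling, and translation:

```python
import numpy as np

def dot_ratios(M, N, P):
    """Return (d_MNP/d_MN, d_MP/d_MN) for triangle MNP, where
    d_MNP = MN.MP, d_MN = MN.MN, and d_MP = MP.MP."""
    mn, mp = N - M, P - M
    return (mn @ mp) / (mn @ mn), (mp @ mp) / (mn @ mn)

rng = np.random.default_rng(0)
M, N, P = rng.standard_normal((3, 3))

# An arbitrary similarity transform: rotation (via QR), scale, translation.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
s, t = 2.5, rng.standard_normal(3)
Ms, Ns, Ps = (s * (Q @ X) + t for X in (M, N, P))

# Both ratios are invariant, which is what turns the triangle similarity
# into two scalar constraints on the unknowns.
print(np.allclose(dot_ratios(M, N, P), dot_ratios(Ms, Ns, Ps)))  # True
```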



Direct Expression of Complete Rotation


The complete rotation from world to camera is directly expressed with the vectors and variables, most of which are shown in the above figure. \begin{equation} \begin{split} _{ \Lambda}^{ \Gamma}\!\mathbf{R} = _{ \Phi}^{ \Gamma}\!\mathbf{R} \, _{ \Lambda}^{ \Phi}\!\mathbf{R} = \frac{\lambda_{ M}}{d_{ M\!S}} [\overrightarrow{\widetilde{M}\widehat{N}}^{ \Gamma}, \overrightarrow{\widetilde{M}\widehat{P}}^{ \Gamma}, \overrightarrow{\widetilde{M}\widehat{S}}^{ \Gamma}] \begin{bmatrix} d_{ M\!P} & -d_{ M\!N\!P} & \\ -d_{ M\!N\!P} & d_{ M\!N} & \\ & & \lambda_{ M} \end{bmatrix}\!\! \begin{bmatrix} {\overrightarrow{MN}^{ \Lambda}}^{\mathrm{T}} \\ {\overrightarrow{MP}^{ \Lambda}}^{\mathrm{T}} \\ {\overrightarrow{MS}^{ \Lambda}}^{\mathrm{T}} \end{bmatrix}\!. \end{split} \label{equ:rotationExpression} \end{equation}
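The expression above recovers the rotation as a product of a camera-frame vector triplet, an adjugate-type middle matrix, and a world-frame vector triplet. As a sanity sketch only (not the paper's exact formula), the same linear structure can be checked by recovering a rotation from two corresponding vector triplets, where the third vector plays the role of $\overrightarrow{MS} = \overrightarrow{MN} \times \overrightarrow{MP}$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground-truth rotation from a random matrix via QR (sign flip ensures det=+1).
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R_true = Q if np.linalg.det(Q) > 0 else -Q

# World-frame edge vectors MN, MP and the auxiliary vector MS = MN x MP.
mn, mp = rng.standard_normal(3), rng.standard_normal(3)
A = np.column_stack([mn, mp, np.cross(mn, mp)])

# Their camera-frame counterparts under the rotation.
B = R_true @ A

# The rotation follows linearly from the two corresponding triplets,
# mirroring the closed-form three-matrix product in the expression above.
R_est = B @ np.linalg.inv(A)
print(np.allclose(R_est, R_true))  # True
```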



Early Verification Based on the 3D Similarity Error Introduced by the Fourth Point


Existing P3P+RANSAC frameworks sample three or four point correspondences as a minimal sample in each iteration. We revisit the problem of removing pose ambiguity when four point correspondences are sampled per iteration. With the fourth point correspondence introduced, the tetrahedra ${M\!N\!P\!Q}$ and ${\widetilde{M}\!\widehat{N}\!\widehat{P}\!\widehat{Q}}$ are similar, as depicted below




Without recovering the pose, we can use the solved $\lambda_1$ and $\lambda_2$ to compute a 3D similarity error that removes the ambiguous solutions of the two quadratic constraints in advance. We have experimentally verified that this error is as robust as the commonly used 2D reprojection error in the P3P+RANSAC pipeline.
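A hypothetical version of such a 3D similarity error can be sketched as the spread of the ratios between corresponding pairwise distances of the two tetrahedra (zero exactly when they are similar); the error definition in the paper may differ:

```python
import numpy as np
from itertools import combinations

def similarity_error(cam_pts, obj_pts):
    """Hypothetical 3D similarity error: relative spread of the ratios
    between corresponding pairwise distances of two tetrahedra."""
    ratios = [np.linalg.norm(cam_pts[i] - cam_pts[j]) /
              np.linalg.norm(obj_pts[i] - obj_pts[j])
              for i, j in combinations(range(4), 2)]
    return np.std(ratios) / np.mean(ratios)

rng = np.random.default_rng(2)
obj = rng.standard_normal((4, 3))              # object points M, N, P, Q

# A consistent candidate: a rotated, scaled, translated copy of the object.
Qr, _ = np.linalg.qr(rng.standard_normal((3, 3)))
good = 1.7 * obj @ Qr.T + np.array([0.1, 0.2, 5.0])

# An ambiguous candidate: the fourth point is displaced.
bad = good.copy()
bad[3] *= 1.3

print(similarity_error(good, obj) < 1e-9)      # True: consistent solution passes
print(similarity_error(bad, obj) > 1e-3)       # True: ambiguous solution is rejected
```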



Experimental Results


1. Numerical Accuracy


The following table shows the numerical accuracy on the noise-free synthetic data.



2. Running Time


The following table shows the running time of different P3P algorithms.



3. Real Localization Experiments


The following table shows the localization results within the vanilla RANSAC framework on the Cambridge Landmarks dataset.





Four-Axis-Rotation Decomposition


By utilizing the coplanarity of $MN$ and $\widetilde{M}\widetilde{N}$, we can decompose the rotation from the object frame to the camera frame into four sequential rotations around frame axes: \begin{equation} _{ \Phi}^{ \Gamma}\!\mathbf{R} = \, _{ \Theta}^{ \Gamma}\!\mathbf{R}_z(\theta) \,\, _{ \Omega}^{ \Theta}\!\mathbf{R}_x(\omega) \,\, _{ \Psi}^{ \Omega}\!\mathbf{R}_y(\psi) \,\, _{ \Phi}^{ \Psi}\!\mathbf{R}_x(\phi). \label{equ:rotationOurExpression} \end{equation} The video below provides an animation of the proposed rotation process.
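A minimal sketch (with illustrative angle values, not from the paper) composes the four elementary rotations and checks that the product is a valid rotation matrix:

```python
import numpy as np

# Elementary rotations about the x, y, and z axes.
def Rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Illustrative angle values.
theta, omega, psi, phi = 0.4, -0.9, 1.3, 0.2

# The four-axis decomposition: R = Rz(theta) Rx(omega) Ry(psi) Rx(phi).
R = Rz(theta) @ Rx(omega) @ Ry(psi) @ Rx(phi)

# Any such product is itself a rotation: orthonormal with determinant +1.
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))  # True True
```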