Perspectivity: Extracting and Changing the Image in a Poster

Here is a Python program that uses OpenCV (cv2) to replace the image in a poster. First, identify the four corners of the image:

img

Then compute the perspectivity that maps these four points in the camera image onto the corners of a rectangle. The result is a 3x3 projection matrix that can be used to extract the picture from the frame.

import cv2
import numpy
camera_image_points = numpy.array([
    (191,31), # Top left
    (330,27), # Top right
    (328,225), # Bottom right
    (187,217)  # Bottom left
], dtype=numpy.float32)
h = 250
w = 200
remapped_points = numpy.array([
    (0,0), # Top left
    (w,0), # Top right
    (w,h), # Bottom right
    (0,h)  # Bottom left
], dtype=numpy.float32)
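# Solve for the 3x3 matrix that maps the camera points onto the rectangle.
M = cv2.getPerspectiveTransform(camera_image_points,
                                remapped_points)

print("M is...")
print(M)
img = cv2.imread("frame_picture.jpg")
# Warp the camera image so that the framed picture fills a w-by-h output.
inside_frame = cv2.warpPerspective(img, M, (w,h))
cv2.imwrite('inside_frame.jpg',inside_frame)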

img

This projection matrix can also be used to replace the image in the frame: its inverse maps a rectangular image back onto the frame in the camera image.

import cv2
import numpy
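# Matrix printed by the previous program (camera image -> rectangle).
M = numpy.array([[1.68508672e+00, 3.62384241e-02,-3.22974955e+02],
                 [4.32301983e-02, 1.50224939e+00,-5.48266990e+01],
                 [5.08113892e-04, 1.01219194e-04, 1.00000000e+00]])
# Invert it to map the rectangle back into the camera image.
M = numpy.linalg.inv(M)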
img = cv2.imread("frame_picture.jpg")
h = 250
w = 200
cat = cv2.resize(cv2.imread("happycat.jpg"), (w, h))
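size = (img.shape[1], img.shape[0])
cat_remap = cv2.warpPerspective(cat, M, size)
# Warp an all-white mask the same way to mark the pixels the cat covers.
# (Comparing against a border color would also match cat pixels of that color.)
mask = cv2.warpPerspective(numpy.full((h, w), 255, dtype=numpy.uint8), M, size)
outside = mask == 0
cat_remap[outside] = img[outside]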
cv2.imwrite('catremap.jpg',cat_remap)

img
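To see why the matrix is inverted before warping the cat, here is a minimal sketch using NumPy only. The helper name `apply_homography` is an illustrative assumption, not a cv2 function. It pushes the corners of the w-by-h rectangle through the inverted matrix, which should land (approximately) on the four corners marked in the camera image.

```python
import numpy

# Matrix printed by the extraction program above (camera image -> rectangle).
M = numpy.array([[1.68508672e+00, 3.62384241e-02, -3.22974955e+02],
                 [4.32301983e-02, 1.50224939e+00, -5.48266990e+01],
                 [5.08113892e-04, 1.01219194e-04, 1.00000000e+00]])
M_inv = numpy.linalg.inv(M)  # rectangle -> camera image


def apply_homography(H, points):
    """Hypothetical helper: map an (N, 2) list of points through a 3x3 matrix H."""
    pts = numpy.asarray(points, dtype=numpy.float64)
    # Append x3 = 1 to make each point homogeneous.
    homogeneous = numpy.hstack([pts, numpy.ones((len(pts), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


w, h = 200, 250
rect_corners = [(0, 0), (w, 0), (w, h), (0, h)]
camera_corners = apply_homography(M_inv, rect_corners)
print(numpy.round(camera_corners, 2))
```

The printed rows should be close to the top-left, top-right, bottom-right and bottom-left points chosen at the start.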

This projection matrix is a perspectivity, a special case of projection (having only 6 degrees of freedom instead of 8).

There are four lines in P2 (the 2D projective space) that go through the origin and intersect the image plane at the previously marked points in the camera image. These lines also intersect a second plane, on which the rectangular inside-frame image is formed.

The general form of a P2 projection is:

\begin{equation} \left( \begin{array}{c} x'_1 \\ x'_2 \\ x'_3 \end{array} \right) = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & h_{33} \end{bmatrix} \left( \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right) \end{equation}

This relationship defines u and v (the coordinates in the remapped image) in terms of the camera image (x1 and x2 with x3 = 1).

\begin{eqnarray} u &=& x'_1 / x'_3 = (x_1 h_{11} + x_2 h_{12} + x_3 h_{13}) / x'_3 \\
v &=& x'_2 / x'_3 = (x_1 h_{21} + x_2 h_{22} + x_3 h_{23}) / x'_3 \\
x'_3 &=& x_1 h_{31} + x_2 h_{32} + x_3 h_{33} \end{eqnarray}
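As a worked instance of these three equations, the sketch below evaluates them term by term for one corner. It assumes the matrix values printed by the extraction program earlier; the function name `project` is made up for illustration. It also shows that scaling the whole matrix leaves u and v unchanged, because the scale cancels in the division by the third coordinate.

```python
import numpy

# Assumed values: the matrix printed by the extraction program earlier.
H = numpy.array([[1.68508672e+00, 3.62384241e-02, -3.22974955e+02],
                 [4.32301983e-02, 1.50224939e+00, -5.48266990e+01],
                 [5.08113892e-04, 1.01219194e-04, 1.00000000e+00]])


def project(H, x1, x2, x3=1.0):
    """Return (u, v) for the homogeneous camera point (x1, x2, x3)."""
    xp1 = x1 * H[0, 0] + x2 * H[0, 1] + x3 * H[0, 2]
    xp2 = x1 * H[1, 0] + x2 * H[1, 1] + x3 * H[1, 2]
    xp3 = x1 * H[2, 0] + x2 * H[2, 1] + x3 * H[2, 2]
    return xp1 / xp3, xp2 / xp3


# Bottom-right corner in the camera image; it should map near (200, 250).
u, v = project(H, 328, 225)
# The same matrix multiplied by an arbitrary scale gives the same (u, v).
u_scaled, v_scaled = project(5.0 * H, 328, 225)
print(u, v)
print(u_scaled, v_scaled)
```

This scale invariance is why the matrix has only eight independent entries and why one entry can be normalized to 1.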

Multiplying the first two equations through by x'_3 and rearranging gives a form suitable for solving:

\begin{eqnarray} x_1 h_{11} + x_2 h_{12} + x_3 h_{13} - h_{31} x_1 u - h_{32} x_2 u - h_{33} x_3 u &=& 0 \\
x_1 h_{21} + x_2 h_{22} + x_3 h_{23} - h_{31} x_1 v - h_{32} x_2 v - h_{33} x_3 v &=& 0 \end{eqnarray}

The four point correspondences in this example can be substituted into the equations above, two rows per point, to create an 8-row, 9-column matrix:

\begin{equation} \begin{bmatrix}
191 & 31 & 1 & 0 & 0 & 0 & -1\times 191\times 0 & -1\times 31\times 0 & -1\times 1\times 0 \\
0 & 0 & 0 & 191 & 31 & 1 & -1\times 191\times 0 & -1\times 31\times 0 & -1\times 1\times 0 \\
330 & 27 & 1 & 0 & 0 & 0 & -1\times 330\times 200 & -1\times 27\times 200 & -1\times 1\times 200 \\
0 & 0 & 0 & 330 & 27 & 1 & -1\times 330\times 0 & -1\times 27\times 0 & -1\times 1\times 0 \\
328 & 225 & 1 & 0 & 0 & 0 & -1\times 328\times 200 & -1\times 225\times 200 & -1\times 1\times 200 \\
0 & 0 & 0 & 328 & 225 & 1 & -1\times 328\times 250 & -1\times 225\times 250 & -1\times 1\times 250 \\
187 & 217 & 1 & 0 & 0 & 0 & -1\times 187\times 0 & -1\times 217\times 0 & -1\times 1\times 0 \\
0 & 0 & 0 & 187 & 217 & 1 & -1\times 187\times 250 & -1\times 217\times 250 & -1\times 1\times 250
\end{bmatrix} \times \left[ \begin{array}{c} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \\ h_{33} \end{array} \right] = 0 \end{equation}
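Writing these rows out by hand is error-prone, so here is a small sketch that builds the same matrix from the point lists. The function name `build_system` is an assumption for illustration; each correspondence contributes one row for u and one row for v, following the two equations above with x3 = 1.

```python
import numpy


def build_system(camera_points, remapped_points):
    """Hypothetical helper: stack two rows per correspondence into a (2N, 9) matrix."""
    rows = []
    for (x1, x2), (u, v) in zip(camera_points, remapped_points):
        x3 = 1
        # Row from the u equation.
        rows.append([x1, x2, x3, 0, 0, 0, -x1 * u, -x2 * u, -x3 * u])
        # Row from the v equation.
        rows.append([0, 0, 0, x1, x2, x3, -x1 * v, -x2 * v, -x3 * v])
    return numpy.array(rows, dtype=numpy.float64)


camera_image_points = [(191, 31), (330, 27), (328, 225), (187, 217)]
remapped_points = [(0, 0), (200, 0), (200, 250), (0, 250)]
A = build_system(camera_image_points, remapped_points)
print(A.shape)
print(A)
```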

The null vector of this matrix, which is the solution for the h values up to scale, can be found using singular value decomposition.

import numpy
A = numpy.array([[191,  31, 1,   0,   0, 0,      0,      0,    0],
                 [  0,   0, 0, 191,  31, 1,      0,      0,    0],
                 [330,  27, 1,   0,   0, 0, -66000,  -5400, -200],
                 [  0,   0, 0, 330,  27, 1,      0,      0,    0],
                 [328, 225, 1,   0,   0, 0, -65600, -45000, -200],
                 [  0,   0, 0, 328, 225, 1, -82000, -56250, -250],
                 [187, 217, 1,   0,   0, 0,      0,      0,    0],
                 [  0,   0, 0, 187, 217, 1, -46750, -54250, -250]],
                dtype=numpy.float64)

# A has 8 rows and 9 columns, so it has a one-dimensional null space.
# Singular values are sorted in descending order, so with the full SVD
# the last row of Vt is the right singular vector spanning that null space.
U, s, Vt = numpy.linalg.svd(A)
h_vec = Vt[-1]

# Divide out the last value, which is a scaling factor.
M = numpy.reshape(h_vec / h_vec[8], (3, 3))

print(M)
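As a cross-check, the same matrix can be obtained without SVD. The sketch below is a swapped-in alternative, not the method above: it assumes h33 is nonzero, fixes h33 = 1, moves that column to the right-hand side, and solves the resulting 8x8 linear system with `numpy.linalg.solve`.

```python
import numpy

camera_image_points = [(191, 31), (330, 27), (328, 225), (187, 217)]
remapped_points = [(0, 0), (200, 0), (200, 250), (0, 250)]

# With h33 = 1, each equation becomes
#   x1*h11 + x2*h12 + h13 - x1*u*h31 - x2*u*h32 = u   (and likewise for v).
lhs, rhs = [], []
for (x1, x2), (u, v) in zip(camera_image_points, remapped_points):
    lhs.append([x1, x2, 1, 0, 0, 0, -x1 * u, -x2 * u])
    rhs.append(u)
    lhs.append([0, 0, 0, x1, x2, 1, -x1 * v, -x2 * v])
    rhs.append(v)

h = numpy.linalg.solve(numpy.array(lhs, dtype=numpy.float64),
                       numpy.array(rhs, dtype=numpy.float64))

# Append h33 = 1 and reshape the nine values into the 3x3 matrix.
M = numpy.append(h, 1.0).reshape(3, 3)
print(M)
```

Both routes should agree up to rounding, since four point correspondences in general position determine the matrix uniquely up to scale.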

References

  • Richard Hartley and Andrew Zisserman. 2003. Multiple View Geometry in Computer Vision (2nd ed.). Cambridge University Press, New York, NY, USA.