Image deformation is a computer vision task that involves manipulating an image to change its shape, as seen in Figure 1. Such a manipulation demands a dense correspondence of pixel coordinates between the source and target images. Mathematically, for every pixel coordinate (x, y), we need to compute a motion vector (u, v) such that

$$I_{\text{source}}(x, y) = I_{\text{target}}(x + u,\, y + v) \qquad (1)$$

where $I_{\text{source}}$ and $I_{\text{target}}$ denote the source and the target image, respectively. Note that the displacement vector (u, v) is, in general, different for each pixel coordinate.
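As a minimal sketch of what Eq. (1) means in code, the snippet below applies a dense per-pixel displacement field with SciPy's `map_coordinates`. It uses the common backward-warping convention (each output pixel samples the source at its displaced location); the function name `warp_dense` and the constant-shift example are illustrative assumptions, not part of the original text.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_dense(source, u, v):
    """Backward-warp: each output pixel (x, y) samples the source at (x + u, y + v).

    source : 2-D grayscale image, shape (H, W)
    u, v   : per-pixel displacements along x (columns) and y (rows), shape (H, W)
    """
    h, w = source.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # map_coordinates expects sample locations in (row, col) order
    return map_coordinates(source, [ys + v, xs + u], order=1, mode="nearest")

# A constant displacement field: every pixel samples 2 columns to its right
img = np.arange(25, dtype=float).reshape(5, 5)
u = np.full_like(img, 2.0)
v = np.zeros_like(img)
shifted = warp_dense(img, u, v)   # e.g. shifted[0, 0] takes the value of img[0, 2]
```

A genuinely dense field would have a different (u, v) at every pixel, exactly as the text states; the constant field is just the simplest case to verify.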

Figure 1: Example of image deformation (source: http://vision.gel.ulaval.ca/~jflalonde/cours/4105/h14/tps/results/project/jingweicao/index.html)

Dense spatial transformation involves a parametric transformation of spatial coordinates in which the mapped location (x', y') in the target image is computed for each pixel (x, y) in the source image. Spatial image transformation depends on prior knowledge of the underlying deformation, which can be either linear or nonlinear. Linear transformations can be further classified into two variants: affine transformations, where the deformation stays within the same plane, and projective transformations, where the deformation occurs across planes between the source and the target. This categorization is depicted in the block diagram of Figure 2. The mathematical representations of the various linear and non-linear transformations are given below.

Figure 2: Block diagram showing the categorization of spatial transformation

#### Translation

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} \qquad (2)$$

where $t_1$ and $t_2$ represent the displacement of the pixel along the X- and Y-directions, respectively.

#### Rotation

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (3)$$

where $\theta$ represents the angle of rotation of the pixel in the anti-clockwise direction.
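The rotation in Eq. (3) can be checked with a few lines of NumPy; `rotate_points` is an illustrative helper, not from the original text.

```python
import numpy as np

def rotate_points(points, theta):
    """Rotate 2-D points anti-clockwise by theta radians about the origin (Eq. 3)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])
    return points @ R.T

# A quarter turn carries the point (1, 0) onto (0, 1)
pts = np.array([[1.0, 0.0]])
rotated = rotate_points(pts, np.pi / 2)
```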

#### Scaling

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (4)$$

where $s_x$ and $s_y$ represent the scaling factors along the X- and Y-directions, respectively.

#### Similarity transformation

It combines translation, rotation, and uniform scaling (the same amount of scaling in both the X- and Y-directions), and is mathematically represented by

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = s \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} \qquad (5)$$

where $s$ is the uniform scaling factor.

#### Shear transformation

Shear transformation, also known as skewing, slants the shape of an object. Note that skewing occurs in one direction at a time, while the other coordinate remains unchanged. Mathematically, it is represented by one of the following equations, for skewing along the X- or the Y-direction, respectively:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & \lambda_x \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (6)$$

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \lambda_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (7)$$

where $\lambda_x$ and $\lambda_y$ are the shear factors.

#### Affine Transformation

It is a composite linear transformation that includes all of the above transformations (translation, rotation, scaling, and shearing) within the same plane, represented by

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} \qquad (8)$$
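To illustrate the "composite" nature of the affine transformation, the sketch below builds the elementary transformations as 3×3 homogeneous matrices and chains them with matrix multiplication; all function names here are illustrative assumptions.

```python
import numpy as np

def translation(t1, t2):
    return np.array([[1, 0, t1], [0, 1, t2], [0, 0, 1]], dtype=float)

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def scaling(sx, sy):
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def shear_x(lam):
    return np.array([[1, lam, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

def apply_affine(A, x, y):
    """Apply a 3x3 homogeneous affine matrix to the point (x, y)."""
    xh = A @ np.array([x, y, 1.0])
    return xh[0], xh[1]

# Compose: scale first, then rotate, then translate (applied right to left)
A = translation(5, -2) @ rotation(np.pi / 2) @ scaling(2, 2)
# (1, 0) -> scaled to (2, 0) -> rotated to (0, 2) -> translated to (5, 0)
x_out, y_out = apply_affine(A, 1.0, 0.0)
```

Because every elementary transformation is linear in homogeneous coordinates, their product is again a single matrix of the form in Eq. (8), which is exactly why the affine family closes under composition.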

#### Projective Transformation

Projective transformation is also linear; however, the transformation occurs across two different planes, and hence two additional parameters are required to model it:

$$x' = \frac{a_1 x + a_2 y + t_1}{c_1 x + c_2 y + 1}, \qquad y' = \frac{a_3 x + a_4 y + t_2}{c_1 x + c_2 y + 1} \qquad (9)$$
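In homogeneous coordinates, Eq. (9) amounts to multiplying by a 3×3 matrix and dividing by the third component. A minimal sketch, with a hypothetical homography that adds only a small perspective term $c_1$:

```python
import numpy as np

def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 projective matrix H, dividing by w (Eq. 9)."""
    xh, yh, w = H @ np.array([x, y, 1.0])
    return xh / w, yh / w

# Hypothetical example: identity plus a perspective term c1 = 0.001
H = np.array([[1.0,   0.0, 0.0],
              [0.0,   1.0, 0.0],
              [0.001, 0.0, 1.0]])

# Points farther along x are divided by a larger w = 0.001*x + 1,
# so (100, 50) maps to (100/1.1, 50/1.1)
mapped = apply_homography(H, 100.0, 50.0)
```

The division by $w$ is what distinguishes the projective case from the affine one: for an affine matrix the bottom row is (0, 0, 1), $w$ is always 1, and the division disappears.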

The dense correspondence of pixel coordinates under any of the above linear transformations requires knowledge of the underlying parameters; for instance, eight parameters in the case of a projective transformation and six in the case of an affine one. Given a set of sparse correspondences of control points, these parameters can be solved for, and the correspondence of the remaining vertices can then be computed from the above equations. The nature of these transformations, alongside the required number of sparse correspondences, is listed in Table 1.
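The claim that a sparse set of correspondences suffices to recover the parameters can be sketched for the affine case: each point pair supplies two linear equations, so three or more pairs determine the six unknowns by least squares. `fit_affine` and the example transform below are illustrative assumptions.

```python
import numpy as np

def fit_affine(src, dst):
    """Estimate the 6 affine parameters from >= 3 point correspondences.

    src, dst : arrays of shape (n, 2); solves dst ~= src @ M.T + t in the
    least-squares sense. Returns the 2x2 linear part M and translation t.
    """
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])      # rows [x, y, 1]
    # Solve X @ P = dst; P (3 x 2) stacks the linear part over the translation
    P, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return P[:2].T, P[2]

# Recover a known transform: rotate 90 degrees, then translate by (1, 2)
M_true = np.array([[0.0, -1.0], [1.0, 0.0]])
t_true = np.array([1.0, 2.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src @ M_true.T + t_true
M, t = fit_affine(src, dst)   # recovers M_true and t_true
```

Once M and t are known, the dense correspondence follows by applying Eq. (8) to every remaining pixel, exactly as the text describes.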

Table 1: Details of Linear Transformations

| Transformation | Parameters | Minimum control-point pairs |
|----------------|------------|-----------------------------|
| Translation    | 2          | 1                           |
| Rotation       | 1          | 1                           |
| Scaling        | 2          | 1                           |
| Similarity     | 4          | 2                           |
| Affine         | 6          | 3                           |
| Projective     | 8          | 4                           |

#### Non-linear Deformation using Thin Plate Spline Transformation

Non-linear deformation occurs when at least one of the source or target bodies is non-rigid. In such a case, a linear transformation alone cannot capture the underlying mapping. Linear transformations are driven by a very limited number of sparse correspondences, whereas non-linear deformation has no fixed degree of freedom []. In general, a non-linear transformation selects a set of handles to control the deformation. These handles may take the form of a triangular mesh, a rectangular grid, a set of straight lines, or a number of control points. The position and movement of these handles drive the deformation in an intuitive fashion.

Thin plate spline interpolation, also known as TPS, is one such nonlinear transformation. It uses a set of control-point pairs (source control points and their target counterparts) to compute the required parameters, and these values are then used to find the dense correspondence for the rest of the vertices. TPS models the non-linear deformation as a function f that establishes a one-to-one correspondence between the source and target vertices. Mathematically,

$$f(x, y) = a_1 + a_2 x + a_3 y + \sum_{i=1}^{n} w_i \, \varphi\!\left( \lVert (x_i, y_i) - (x, y) \rVert \right) \qquad (10)$$

where $\lVert (x_i, y_i) - (x, y) \rVert$ is the Euclidean norm between the control point $(x_i, y_i)$ and a source vertex $(x, y)$, $\varphi(r) = r^2 \log r$ is the radial basis function, $w_i$ are the non-linear weights associated with each control point $i$, and $\{a_1, a_2, a_3\}$ are the affine parameters. One such function is fitted for each output coordinate, so in total we need to compute the values of these $2n + 6$ parameters with the prior knowledge of the $n$ control-point pairs.
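A minimal NumPy sketch of this fit, under the standard assumptions $\varphi(r) = r^2 \log r$ and the usual side conditions that make the weights sum to zero; the names `tps_fit` and `tps_eval` are illustrative, not a reference implementation.

```python
import numpy as np

def _phi(r):
    """TPS radial basis: phi(r) = r^2 log r, with phi(0) = 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def tps_fit(src, dst):
    """Solve for the 2n + 6 TPS parameters mapping src control points to dst.

    src, dst : (n, 2) arrays of control-point pairs.
    Returns (w, a): weights w of shape (n, 2) and affine terms a of shape
    (3, 2) -- one column per output coordinate, as in Eq. (10).
    """
    n = len(src)
    K = _phi(np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1))
    P = np.hstack([np.ones((n, 1)), src])              # rows [1, x_i, y_i]
    # Standard TPS linear system with side conditions P.T @ w = 0
    L = np.zeros((n + 3, n + 3))
    L[:n, :n], L[:n, n:], L[n:, :n] = K, P, P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    sol = np.linalg.solve(L, rhs)
    return sol[:n], sol[n:]

def tps_eval(pts, src, w, a):
    """Evaluate Eq. (10) at arbitrary points pts of shape (m, 2)."""
    U = _phi(np.linalg.norm(pts[:, None, :] - src[None, :, :], axis=-1))
    return np.hstack([np.ones((len(pts), 1)), pts]) @ a + U @ w

# Identity on three corners of the unit square; the fourth corner displaced
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src.copy()
dst[3] = [1.2, 1.1]
w, a = tps_fit(src, dst)
warped = tps_eval(src, src, w, a)   # reproduces dst at the control points
```

Because the system is solved exactly, the fitted mapping interpolates the control points, and `tps_eval` then supplies the dense correspondence for every other vertex, which is precisely the role Eq. (10) plays in the text.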

#### Role of control points in Image deformation

Image deformation, whether linear or nonlinear, requires prior knowledge of a set of control-point pairs to compute the parameters of the spatial transformation, after which the dense correspondence is found using the underlying equation. In other words, (a) extracting these key points and (b) finding their correspondence in the target image constitute a research domain of their own. The former problem can be tackled using a pose-estimation-like architecture that regresses joint and corner locations, whereas the latter requires a learning-based geometric matching module to find the mapped locations in the target image.

#### References

Schaefer, Scott, Travis McPhail, and Joe Warren. "Image Deformation Using Moving Least Squares." In ACM SIGGRAPH 2006 Papers, pp. 533-540. 2006.

Bookstein, Fred L. "Principal Warps: Thin-Plate Splines and the Decomposition of Deformations." IEEE Transactions on Pattern Analysis and Machine Intelligence 11, no. 6 (1989): 567-585.

Gong, Wenjuan, Xuena Zhang, Jordi Gonzàlez, Andrews Sobral, Thierry Bouwmans, Changhe Tu, and El-hadi Zahzah. "Human Pose Estimation from Monocular Images: A Comprehensive Survey." Sensors 16, no. 12 (2016): 1966.