Feature Extraction - Proposed method - Image Forgery Detection and Localization

2.2 Proposed method

2.2.3 Feature Extraction

A large number of features have been proposed in recent years for the purpose of copy-move detection, and many of them have been considered in the extensive experimental comparison carried out in [29]. Here, we will focus on features based on the family of Circular Harmonic Transforms (CHT) [58], which possess desirable invariance properties.

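Let $I(x, y)$ be a scalar image defined on a continuous space, $(x, y) \in \mathbb{R}^2$, and let $I(\rho, \theta)$ be its representation in polar coordinates, with $\rho \in [0, \infty)$ and $\theta \in [0, 2\pi]$. The CHT coefficients are evaluated by projecting the image over the basis functions $K_{n,m}(\rho, \theta)$ of the transform

$$F_I(n, m) = \int_0^{\infty} \int_0^{2\pi} I(\rho, \theta) \, K^*_{n,m}(\rho, \theta) \, \rho \, d\rho \, d\theta \tag{2.18}$$

The basis functions have the form

$$K_{n,m}(\rho, \theta) = R_{n,m}(\rho) \, \frac{1}{\sqrt{2\pi}} \, e^{jm\theta} \tag{2.19}$$

that is, they are obtained as the product of a radial profile $R_{n,m}(\rho)$ and a circular harmonic. Therefore (2.18) can be rewritten as

$$F_I(n, m) = \int_0^{\infty} R^*_{n,m}(\rho) \left[ \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} I(\rho, \theta) \, e^{-jm\theta} \, d\theta \right] \rho \, d\rho \tag{2.20}$$

The integral in square brackets, let us call it $\hat{I}(\rho)$, is the Fourier series of $I(\rho, \theta)$ along the angle coordinate. Therefore, a rotation of $\theta_0$ radians in $I$ contributes just a phase term $e^{jm\theta_0}$ in $\hat{I}$, which disappears if one takes the magnitude of the coefficients, thereby obtaining rotation invariance.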
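The rotation argument can be checked numerically on a single circle of samples. The following is a minimal sketch, assuming NumPy and a uniform angular sampling; the helper name `angular_coefficients` and the random test signal are illustrative and not part of the original method. With the shift direction used by `np.roll`, the phase term comes out as $e^{-jm\theta_0}$; the sign depends only on the convention chosen for the direction of rotation and does not affect the magnitude.

```python
import numpy as np

def angular_coefficients(ring, orders):
    """Riemann-sum version of the bracketed integral in (2.20) at one radius.

    `ring` holds samples of I(rho, theta) at uniformly spaced angles.
    """
    n_theta = ring.shape[0]
    theta = 2 * np.pi * np.arange(n_theta) / n_theta
    d_theta = 2 * np.pi / n_theta
    return np.array([
        np.sum(ring * np.exp(-1j * m * theta)) * d_theta / np.sqrt(2 * np.pi)
        for m in orders
    ])

rng = np.random.default_rng(0)
ring = rng.random(64)             # hypothetical samples on one circle
orders = np.arange(5)             # angular orders m = 0..4
shift = 5                         # rotation by 5 sampling steps
theta0 = 2 * np.pi * shift / ring.size

c = angular_coefficients(ring, orders)
c_rot = angular_coefficients(np.roll(ring, shift), orders)

# The rotation only multiplies each coefficient by a unit-modulus phase,
# so the magnitudes coincide.
print(np.allclose(c_rot, c * np.exp(-1j * orders * theta0)))
print(np.allclose(np.abs(c_rot), np.abs(c)))
```

The equality is exact here (up to floating-point error) because the rotation is a whole number of sampling steps; for other angles it holds only approximately.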

The various CHTs differ in the radial profile. We consider three choices: the Zernike Moments (ZM) [97], the Polar Cosine Transform (PCT) [105], and the Fourier-Mellin Transform (FMT) [93]. Zernike radial functions are defined as

$$R_{n,m}(\rho) = \sum_{h=0}^{(n-|m|)/2} C_{n,m,h} \, \rho^{n-2h} \tag{2.21}$$

for $\rho \in [0, 1]$, with $C_{n,m,h}$ suitable coefficients that ensure orthonormality of the basis functions. Fig. 2.6(a) shows some of these functions, chosen among those with lowest order. In the PCT, the radial functions are just cosines with argument $\rho^2$,

$$R_n(\rho) = C_n \cos(\pi n \rho^2) \tag{2.22}$$

limited again to $\rho \in [0, 1]$, with normalizing coefficients $C_n$. Some of these functions are shown in Fig. 2.6(b). In the FMT, instead, they are defined as

$$R_\nu(\rho) = \frac{1}{\rho^2} \, e^{j\nu \ln(\rho)} \tag{2.23}$$

Notice however that, in this case, the functions are non-zero for all $\rho \geq 0$, they diverge at the origin, and the parameter $\nu$ is continuous-valued. With this choice, the integral (2.20) becomes just the Fourier transform of $\hat{I}(\rho)$ after a coordinate remapping, while the whole FMT can be regarded as the bi-dimensional Fourier transform of $I$ in log-polar coordinates. As a consequence, a scale change in $I$ contributes only a phase in the FMT coefficients, which disappears after taking the absolute value, granting also scale invariance.
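The coordinate remapping can be made explicit with a short derivation. This is a sketch under the assumption that the integrals converge, writing $F_I(\nu, m)$ for the FMT coefficient (the continuous parameter $\nu$ takes the place of $n$) and $a > 0$ for a hypothetical scale factor. Substituting $u = \ln \rho$, so that $d\rho = \rho \, du$, in the radial integral with the FMT profile $R_\nu(\rho) = \rho^{-2} e^{j\nu \ln \rho}$ gives

$$F_I(\nu, m) = \int_0^{\infty} \frac{1}{\rho^2} \, e^{-j\nu \ln \rho} \, \hat{I}(\rho) \, \rho \, d\rho = \int_{-\infty}^{\infty} \hat{I}(e^u) \, e^{-j\nu u} \, du$$

which is the Fourier transform of $\hat{I}(e^u)$ with respect to $u$. For a rescaled image $I_a(\rho, \theta) = I(a\rho, \theta)$, the angular series becomes $\hat{I}(a\rho)$, and the change of variable $v = u + \ln a$ yields

$$F_{I_a}(\nu, m) = \int_{-\infty}^{\infty} \hat{I}(e^{u + \ln a}) \, e^{-j\nu u} \, du = e^{j\nu \ln a} \, F_I(\nu, m)$$

so that $|F_{I_a}(\nu, m)| = |F_I(\nu, m)|$: the scale factor survives only in the phase.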

Now, we have to translate these theoretical definitions into practical finite-length features that characterize the image locally. These must be computed on the available data, sampled on a discrete grid, while preserving the invariance properties. To this aim, we have to select a finite number of $(n, m)$ pairs, define a suitable patch size and, for each pixel $s$, compute the coefficients $F_{I(s)}(n, m)$ by approximating the integral of (2.18) with a summation over the patch centered on $s$. Finally, the feature $f(s)$ will be the collection of the magnitudes of these coefficients.

The patch size must guarantee a good compromise between discrimination and robustness. Patches that are too small might not capture the local image behavior, while patches that are too large might lose resolution and lead to false alarms. Likewise, features should not be unnecessarily long, to avoid slowing down all processing steps, but should still be expressive enough to allow correct matches. We will not describe the preliminary experiments carried out to set these quantities; the selected values are reported in Tab. 2.1. Patches of 16×16 or 24×24 pixels are used, with features of length 12 for Zernike, 10 for PCT, and 25 for FMT, corresponding always to the lower-order² basis functions.
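To make the idea of a dense feature field concrete, the sketch below computes, for every 16×16 patch of an image, the magnitudes of a few PCT-like coefficients by a plain sum over the patch pixels. It is only an illustration under stated assumptions: the function names, the chosen $(n, m)$ pairs, the random test image and the omission of the normalizing constants are all hypothetical, and the basis is simply evaluated at the pixel centers rather than with any more careful sampling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def cartesian_kernel(n, m, size):
    """PCT-like basis function evaluated at the pixel centers of a patch."""
    half = (size - 1) / 2
    y, x = np.mgrid[0:size, 0:size] - half
    rho = np.hypot(x, y) / (size / 2)          # radius normalized to the patch
    theta = np.arctan2(y, x)
    kernel = (np.cos(np.pi * n * rho**2)
              * np.exp(1j * m * theta) / np.sqrt(2 * np.pi))
    kernel[rho > 1] = 0                        # basis limited to the unit disk
    return kernel

def dense_features(img, pairs, size=16):
    """Feature field: one vector of coefficient magnitudes per patch position."""
    patches = sliding_window_view(img, (size, size))
    feats = [
        np.abs(np.einsum("ijkl,kl->ij", patches,
                         np.conj(cartesian_kernel(n, m, size))))
        for n, m in pairs
    ]
    return np.stack(feats, axis=-1)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
img[40:60, 40:60] = img[5:25, 5:25]            # toy copy-move of a 20x20 block

pairs = [(0, 0), (1, 1), (2, 1)]               # hypothetical low-order pairs
feats = dense_features(img, pairs)

print(feats.shape)
# Patches lying entirely inside the source and the copied block coincide,
# so their features match.
print(np.allclose(feats[5, 5], feats[40, 40]))
```

In this layout `feats[i, j]` is the feature of the patch whose top-left corner is pixel `(i, j)`, which is why matching features at two distant positions point to a copied region.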

Let us focus, instead, on the approximation of the integral (2.18). A straightforward solution is to resample the basis functions $K_{n,m}(\rho, \theta)$ on the grid points $(x, y)$ of the analysis patch, $W$, where the image is defined (see Fig. 2.7(a)), computing therefore

$$F'_I(n, m) = \sum_{(x,y) \in W} I(x, y) \, K^*_{n,m}(\rho(x, y), \theta(x, y)) \tag{2.24}$$

Another viable solution is to resample the image on polar (or log-polar) coordinates (see Fig. 2.7(b)-(c)), and compute

$$F''_I(n, m) = \sum_{(\rho,\theta) \in W} I(x(\rho, \theta), y(\rho, \theta)) \, K^*_{n,m}(\rho, \theta) \, \rho^i \tag{2.25}$$

with $i = 1$ for the polar grid and $i = 2$ for the log-polar one. This seemingly minor difference has non-negligible consequences on performance, in particular on rotation invariance [102]. In fact, polar sampling guarantees perfect invariance for rotation angles that are multiples of the sampling step $\Delta\theta$, and a good approximation of it in all other cases, provided $\Delta\theta$ is not too large. On the contrary, with rotation angles close to $\pi/4 \pm k\pi/2$, features computed on the cartesian grid can change significantly, undermining the invariance property, as also shown in [72], where an accurate analysis of the errors induced by sampling is carried out.

Figure 2.7: Examples of rectangular (a), polar (b) and log-polar (c) sampling grids.

² For FMT, $\nu = 2n\pi / \log(\rho_{\max}/\rho_{\min})$, for $n = 0, \pm 1, \pm 2$.

In addition, the two solutions have the same computational efficiency, since, given $\rho$ and $\theta$, the interpolated values $I(\rho, \theta)$ in (2.25) are computed from available data points with fixed weights, falling back again to a filtering of the form (2.24), only with different weights. We will therefore resort to the polar sampling for both Zernike and PCT features, but keep also the cartesian sampling as reference. For FMT, instead, we will obviously use a log-polar sampling, aiming at scale invariance. However, we are forced to exclude points too close to the origin [115], that is, to the central pixel $s$, where the radial functions diverge.
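The polar-grid summation of (2.25), with $i = 1$, can be sketched as follows for a single patch. This is an illustrative implementation under several assumptions that are not taken from the text: bilinear interpolation for the resampling, 8 radii and 32 angles, a PCT radial profile without normalizing constants, and hypothetical function names. The check at the end uses a 90° rotation, which is the one rotation of a pixel grid that needs no image interpolation and which equals 8 angular sampling steps here.

```python
import numpy as np

def bilinear(img, rows, cols):
    """Bilinear interpolation of img at fractional (row, col) positions."""
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    fr, fc = rows - r0, cols - c0
    return ((1 - fr) * (1 - fc) * img[r0, c0]
            + (1 - fr) * fc * img[r0, c0 + 1]
            + fr * (1 - fc) * img[r0 + 1, c0]
            + fr * fc * img[r0 + 1, c0 + 1])

def pct_features_polar(img, center, radius, pairs, n_rho=8, n_theta=32):
    """Magnitudes of the sums in (2.25) on a polar grid (i = 1)."""
    rho = (np.arange(n_rho) + 0.5) / n_rho     # normalized radii in (0, 1)
    theta = 2 * np.pi * np.arange(n_theta) / n_theta
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    rows = center[0] + radius * rr * np.sin(tt)
    cols = center[1] + radius * rr * np.cos(tt)
    samples = bilinear(img, rows, cols)        # image resampled on the grid
    feats = []
    for n, m in pairs:
        kernel = (np.cos(np.pi * n * rr**2)
                  * np.exp(1j * m * tt) / np.sqrt(2 * np.pi))
        feats.append(np.abs(np.sum(samples * np.conj(kernel) * rr)))
    return np.array(feats)

rng = np.random.default_rng(0)
img = rng.random((33, 33))                     # odd size: pixel (16, 16) is the center
pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (1, 2)]

f = pct_features_polar(img, (16, 16), 8, pairs)
f_rot = pct_features_polar(np.rot90(img), (16, 16), 8, pairs)

# Rotating by a multiple of the angular step only shifts the samples
# along theta, so the magnitudes are unchanged.
print(np.allclose(f, f_rot))
```

Since the sampling positions and interpolation weights do not depend on the image content, each coefficient is still a fixed weighted sum of the patch pixels, consistent with the filtering interpretation given above.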

We note explicitly that CHT-based features have already been used for forgery detection. Zernike moments, for example, have been adopted in [92, 29, 91], with cartesian sampling, providing interesting results. Likewise, the PCT has been investigated in [71], again with cartesian sampling. FMT-based features have also been considered for forgery detection [11], but with unimpressive results, as reported in [29]. However, the implementation proposed in [11], inspired by [73], includes further processing steps that disrupt the invariance properties, which are so useful for robust copy-move detection. Similarly, in [100], the features are formed by taking some cross-spectra, rather than the magnitude of the coefficients themselves.


Figure 2.8: Three forged images with different levels of activity from the FAU database. From top: smooth, rough, structured.

2.3 Experimental evaluations

In this section, we present the results of a number of experiments carried out in order to fine-tune the proposed technique and assess its performance with respect to the state of the art. In order to guarantee reproducibility of results, our code is available online³, and experiments are carried out on two databases also available online. The database used in [29], which we will call FAU⁴ from now on, comprises 48 images with realistic copy-move forgeries, some examples of which are shown in Fig. 2.8, classified as smooth, rough or structured. These images are quite large, with typical size 3000×2400 pixels, and tampered areas cover about 6% of each image, on average. We prepared a further database⁵ composed of 80 images, again with realistic copy-move forgeries, some of which are shown in Fig. 2.9. All these images have size 768×1024 pixels, while the forgeries have arbitrary shapes, aimed at obtaining visually satisfactory results, with size going from about 4000 pixels (less than 1% of the image) to about 50000 pixels. In adding this new database, called GRIP from now on, we wanted to enrich the experimental set available to the community, and include also forgeries of relatively small size. However, we were also motivated by the practical need to run in a reasonable time the large number of experiments needed to fine-tune and validate the proposed technique in various situations of interest.

³ http://www.grip.unina.it

⁴ http://www5.cs.fau.de/

⁵ http://www.grip.unina.it

Results are provided both at pixel level and image level. To assess synthetically the image-level performance we use the F-measure, defined as

$$F = \frac{2\,TP}{2\,TP + FN + FP} \tag{2.26}$$

where TP (true positive), FN (false negative), and FP (false positive) count, respectively, the number of detected forged images, undetected forged images, and wrongly detected genuine images. Similar definitions are used at pixel level for each image to obtain, after averaging on all images, the pixel-level F-measure. At image level we measure, therefore, only the ability to correctly recognize an image as forged or genuine, while the pixel-level measure accounts also for localization accuracy. At pixel level we exclude from computation the pixels at the boundary between forgery and background, where the transparency is set to an intermediate value between 0 and 1 to avoid artifacts.
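As a small worked illustration of (2.26), the sketch below computes the F-measure from raw counts and, for the pixel-level case, from boolean masks with an optional set of excluded pixels. The function names, the `ignore` mask argument and the convention of returning 0 when the denominator vanishes are assumptions made for this example, not details given in the text.

```python
import numpy as np

def f_measure(tp, fn, fp):
    """F-measure of (2.26); returns 0.0 if the denominator is zero (assumed convention)."""
    denom = 2 * tp + fn + fp
    return 2 * tp / denom if denom > 0 else 0.0

def pixel_f_measure(detected, truth, ignore=None):
    """Pixel-level F-measure of one image from boolean masks.

    `ignore` marks pixels left out of the counts, e.g. the boundary
    between forgery and background.
    """
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if ignore is None:
        valid = np.ones(truth.shape, dtype=bool)
    else:
        valid = ~np.asarray(ignore, dtype=bool)
    tp = np.count_nonzero(detected & truth & valid)
    fn = np.count_nonzero(~detected & truth & valid)
    fp = np.count_nonzero(detected & ~truth & valid)
    return f_measure(tp, fn, fp)

truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True            # 4 forged pixels
detected = np.zeros((4, 4), dtype=bool)
detected[1:3, 1:4] = True         # 6 detected pixels, 4 of them correct

# TP = 4, FN = 0, FP = 2, so F = 8 / (8 + 0 + 2) = 0.8
print(pixel_f_measure(detected, truth))
```

The pixel-level figure for a whole database would then be the average of `pixel_f_measure` over its images, as described above.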

Processing time is another key performance parameter, since reliable copy-move detectors are known to be rather slow, a non-negligible problem with images that become larger and larger as technology advances. We will therefore report also the average CPU time per image, measured on a computer with a 2 GHz Intel Xeon processor, operating in single-thread mode.

The next subsection is devoted to analyzing the proposed technique, while the subsequent one compares its performance with the state of the art.
