1 Introduction

The problem of a hidden data filter has so far only been outlined in [5], in the form of attacks aimed at eliminating or masking the watermark. Such considerations are absent from [4, 17], while in [2, 16] the problem of a hidden data filter is treated in a manner similar to the masking described in [5]. In the book [1] the authors give examples of removing the watermark from watermarked pictures by means of a collusion or oracle attack. In each of these cases, however, the examples concern a masking attack, not the perfect removal of the watermark from the watermarked picture that would make it possible to return to the host image. It must also be taken into account that the perfect hidden data filter has to be considered separately from the robustness of the method, which, according to [9, 16], should be assessed not on the basis of the properties of the algorithm or access to the watermarking application, but on the basis of general transformations of the watermarked picture in both the spatial and frequency domains.

Elements of professionally prepared attacks that test the robustness of watermarking methods are included in the StirMark application [8, 14]. It offers a wide range of methods that can potentially prevent the detection of a watermark in a picture containing additional, invisible information. The capabilities of a watermarking method are tested automatically using a prepared protocol.

However, the literature offers no algorithms that eliminate the watermark signal from the watermarked signal under the condition of returning to the original signal while maintaining high signal quality. This article presents a theoretical model of the perfect hidden communication filter, an accurate description of both functions (the one eliminating and the one masking the watermark signal), a practical implementation of the hidden communication filter, and the results of efficiency tests.

2 Perfect hidden data filter

2.1 Perfect hidden transmission filter

The description of the perfect hidden communication filter begins with a specification of the communication channel used in the watermarking. Let us consider O as one of the set of all original signals (images, videos, sounds, texts, 3D objects, etc.), W – as one of the set of all watermarks containing information i, \( K \) – as one of the set of all keys used in watermarking (not all watermarking applications require \( K \) keys). Using this notation, it is possible to describe the coding process \( {E}_{wmf} \) which produces all possible watermarked signals \( {O}_{wm} \) and the watermark decoding \( {D}_{wmf} \) as two functions:

$$ {E}_{wmf}\ :O\times K\times W\to {O}_{wm} $$
(1)
$$ {O}_{all}\backepsilon O,\ {O}_{wm},\ {O}_{wm}^{\prime } $$
(2)
$$ {D}_{wmf}:\ {O}_{all}\times K\to empty $$
(3)
$$ {D}_{wmf}:\ {O}_{wm}^{\prime}\times K\to W $$
(4)
  • \( {O}_{all} \) designates any of the possible decoder inputs; Eq. (3) describes the case in which the tested signal is rejected,

  • \( {O}_{wm}^{\prime } \) designates any watermarked signal after it has passed through the communication channel, taking into account possible attacks on the signal, both intentional and unintentional.

Within the decoding function, the supplementary information added to the original signal is detected by a comparator function \( {C}_{c\tau } \), which compares the watermark recovered from the marked signal with the a priori decision threshold τ and indicates whether additional information is present in the signal. Only after a positive comparator result is information i retrieved from the watermark signal W.

$$ {C}_{c\tau }:\ {O}_{wm}^{\prime }\ \to \left\{0,1\right\} $$
(5)

In the communication channel the form of the watermarked signal \( {O}_{wm} \) can be changed intentionally or unintentionally. In both cases it is converted into signal \( {O}_{wm}^{\prime } \).

The ideal watermark signal elimination function \( {F}_{c-wm} \) recovers the approximate original form \( {O}^{\prime } \) of the signal from the watermarked signal \( {O}_{wm}^{\prime } \), even though the latter has been deformed in the communication channel. In the general case, the purpose of the Filtering Block \( {B_F}_{c-wm} \) is to eliminate the watermark signal W from the deformed watermarked signal \( {O}_{wm}^{\prime } \).

$$ {F}_{c-wm}:{O}_{wm}^{\prime}\times {B_F}_{c-wm}\to {O}^{\prime } $$
(6)

One must take into account that eliminating the watermark signal W with function \( {F}_{c-wm} \) is a different approach from masking it with a masking function \( {M}_{c-wm} \). Briefly, \( {F}_{c-wm} \) can be described as an ideal masking function \( {M}_{c-wm} \). The masking function removes part of the watermark W; in the ideal case:

$$ {F}_{c-wm} \backepsilon {M}_{c-wm} $$
(7)

However, the function \( {M}_{c-wm} \) is characterised by a partial removal of watermark W from the watermarked signal \( {O}_{wm}^{\prime } \), while simultaneously causing degradation \( {O}_{deg}^{\prime } \) of the target form of signal \( {O}^{\prime } \). This stems from the fact that \( {M}_{c-wm} \) interferes with part of the signal \( {O}_{wm}^{\prime } \) in the process of masking part W. An example of such a type of function and its practical implementation is presented later in the paper.

$$ {M}_{c-wm}:{O}_{wm}^{\prime}\times {B_M}_{c-wm}\to {O}_{deg}^{\prime } $$
(8)

2.2 Elimination vs. masking

When signal \( {O}_{wm} \) is deformed into \( {O}_{wm}^{\prime } \), the decoding function \( {D}_{wmf} \) must still be able to detect watermark W, but the crucial element is the recovery of information i, which in watermarking usually contains the copyright data for a particular piece of media. Watermark masking is most likely a deliberate attack aimed at preventing the decoding of information i or the detection of watermark W. A similar dependency occurs for the eliminating function, but there both i and W must be removed, and image \( {O}_{wm}^{\prime } \) must return to the form \( {O}^{\prime } \), a condition that is not critical for function \( {M}_{c-wm} \). Existing considerations in the literature [1, 2, 4, 16, 17] have not concerned function \( {F}_{c-wm} \) but only the masking function. For the latter, the algorithms are based on the following boundary conditions for the person conducting the masking:

  • access to the signal,

    • access only to the watermarked signal – most likely case,

    • access to watermarked signals and corresponding information i,

    • access to watermarked signals and corresponding original signals (with the goal being to recover information i),

  • access to the encoder,

  • access to the decoder,

  • access to the watermarking algorithm – both \( {E}_{wmf} \) and \( {D}_{wmf} \) functions.

The literature contains examples of masking filtering, such as [12], where the authors propose a method of masking watermarks embedded by algorithms based on spectral dispersion; it uses non-linear filtering to estimate the watermark in the watermarked signal. First, they filter the watermarked signal with a 3×3 median filter. They then take the difference between the watermarked signal and the median-filtered signal, high-pass filter it, scale it by an empirically determined coefficient, and subtract the result from the watermarked signal. In this way they estimate the watermark signal and can successfully mask it.
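The median-based estimation described above can be illustrated with a minimal sketch in plain Python. It is not the implementation from [12]: the high-pass filtering step is omitted, the border handling is an arbitrary choice, and the function names and the `scale` parameter are hypothetical.

```python
from statistics import median

def median3x3(img):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [img[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, rows))
                      for cc in range(max(c - 1, 0), min(c + 2, cols))]
            out[r][c] = median(window)
    return out

def mask_watermark(y_wm, scale=1.0):
    """Estimate the watermark as (y_wm - median(y_wm)) and subtract it, scaled.

    Simplification: the high-pass filtering of the difference is left out.
    """
    smooth = median3x3(y_wm)
    rows, cols = len(y_wm), len(y_wm[0])
    estimate = [[y_wm[r][c] - smooth[r][c] for c in range(cols)]
                for r in range(rows)]
    attacked = [[y_wm[r][c] - scale * estimate[r][c] for c in range(cols)]
                for r in range(rows)]
    return attacked, estimate

# Toy input: a flat host with one disturbed pixel standing in for watermark energy
host = [[100.0] * 5 for _ in range(5)]
y_wm = [row[:] for row in host]
y_wm[2][2] += 4.0
attacked, estimate = mask_watermark(y_wm, scale=1.0)
```

On this toy input the isolated disturbance is estimated and removed completely; on a real picture the median filter also removes fine image detail, which is why the result is a masking rather than an elimination of the watermark.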

An example of eliminating filtration is collusion processing, in which the attacker has only the watermarked signals; in this case the watermark signal W must be partially uniform (masking) or totally uniform (elimination) across those signals. Only watermarked signals are required for this type of filtration, which means that it is a blind method. The process is based on averaging the watermarked signals: the watermark signal emerges from the noise formed by the random values of the remaining averaged samples of the watermarked signals. Once the watermark signal has been estimated in this way, all that remains is to remove the recurring samples of the watermark signal W by subtraction. This attack applies to watermarking methods in which the watermark signal W added to the original signal O is not a function of O. The effectiveness of this type of elimination filtration has been shown in [7] for algorithms for watermarking films.
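A minimal sketch of the averaging step, written in plain Python, may help to illustrate it. It assumes a purely additive watermark that is identical in every copy and toy host signals that are zero-mean and mutually independent; these assumptions, the signal sizes, and all names are illustrative and are not taken from [7].

```python
import random

def collusion_estimate(signals):
    """Average the watermarked signals sample by sample."""
    count = len(signals)
    return [sum(s[k] for s in signals) / count for k in range(len(signals[0]))]

random.seed(0)
length, copies = 64, 200
watermark = [random.choice((-0.5, 0.5)) for _ in range(length)]
# Assumption: zero-mean, independent hosts and the same additive watermark in each
hosts = [[random.uniform(-1.0, 1.0) for _ in range(length)] for _ in range(copies)]
marked = [[h[k] + watermark[k] for k in range(length)] for h in hosts]

# The hosts average out towards zero, so the mean approximates the watermark
w_est = collusion_estimate(marked)
# Subtracting the estimate from each copy approximates the corresponding host
recovered = [[m[k] - w_est[k] for k in range(length)] for m in marked]

err_before = max(abs(marked[0][k] - hosts[0][k]) for k in range(length))
err_after = max(abs(recovered[0][k] - hosts[0][k]) for k in range(length))
```

The residual error after subtraction shrinks as more copies are averaged, which reflects the distinction between partial (masking) and total (elimination) removal.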

If the person eliminating or masking the watermark signal has access to the watermark encoder, effective masking of the watermark signal is possible and, in a special case, so is its elimination. The attacker does not need to possess the physical encoder; temporary access is enough to watermark their own signal or a few original signals. This applies especially to algorithms that use the entire space of the original signal or fixed blocks of it, because a watermarked signal obtained in this way makes it possible to determine the spatial or spectral range in which the encoder operates and to establish a fixed filtering matrix. Adaptive methods are more resistant to such attacks, since a larger collection of original signals and their corresponding watermarked signals is needed for generalized determinations.

Another particular example of a masking algorithm is described in [11], where the authors prove that with access to a single watermarked signal and a decoder it is possible to recover part of the original signal. They exploit the pseudo-linear dependencies used in the detector, and the original signal can be recovered for a watermark signal without the DC component and within the range of values {−1, 1}. It should be noted, however, that this attack targets a watermarking algorithm for broadcast applications, where each user has access to the watermark decoder.

In [10] the authors describe a masking algorithm that removes the additional information from the watermarked signal while maintaining the quality of the original signal, in this case a picture. For algorithm [6] it obtained PSNR = 36.65 dB with NC = 0.12, and for [13] PSNR = 32.95 dB with NC = 0.28; these and the other attacked watermarking algorithms are of the non-blind type (which greatly limits their use in practical watermarking applications). It should be noted that the masking algorithm removes the watermark signal from the watermarked signal quite precisely, while maintaining good quality of the reconstructed original signal.

3 Implementation of the filter

In article [15] a filter for hidden picture transmission was implemented and its effectiveness was tested. The watermark elimination function \( {F}_{c-wm} \) begins with the conversion of the watermarked picture \( {O}_{wm}^{\prime } \) from RGB to the YCbCr representation \( {O}_{wm\; YCbCr}^{\prime } \). Analysis is then carried out in the cepstrum domain in order to determine the spatial translation of the added copy of the luminance matrix with reduced energy. For this purpose a 2-dimensional Discrete Fourier Transform is performed on the watermarked luminance matrix \( {Y}_{wm}^{\prime } \):

$$ \begin{array}{c}\hfill {Y}_{wm\ DFT}^{\prime}\left(k,l\right) = {\displaystyle \sum_{x=0}^{X-1}}\left[{\displaystyle \sum_{y=0}^{Y-1}}{Y}_{wm}^{\prime}\left(x,y\right){b}_{YDFT}^{*}\left(l,y\right)\right]{b}_{XDFT}^{*}\left(k,x\right)\hfill \\ {}\hfill {Y}_{wm\ XYDFT}^{\prime } = {B}_{XDFT}^{*}{Y}_{wm\ XY}^{\prime }{B}_{YDFT}^{*T}\hfill \\ {}\hfill {b}_{DFT}\left(k,x\right) = \sqrt{\frac{1}{X}} \exp \left(j\frac{2\pi k}{X}x\right)\hfill \\ {}\hfill {b}_{DFT}\left(l,y\right) = \sqrt{\frac{1}{Y}} \exp \left(j\frac{2\pi l}{Y}y\right)\hfill \end{array} $$
(9)
  • x,y – indexes of discrete spatial positions of pixels,

  • X,Y – spatial resolution of images,

  • k,l – indexes of discrete, 2D signal frequencies of the spectrum.

Then the cube of the 2-dimensional autocepstrum function is calculated from the matrix \( {Y}_{wm\;DFT}^{\prime } \):

$$ {Y}_{wm\ cepst}^{\prime}\left(m,n\right)={\left( IDFT\left( \ln \left(\left|{Y}_{wm\ DFT}^{\prime}\left(k,l\right)\right|\right)\right)\right)}^3 $$
(10)
$$ {Y}_{wm\ IDFT}^{\prime}\left(x,y\right) = {\displaystyle \sum_{k=0}^{X-1}}\left[{\displaystyle \sum_{l=0}^{Y-1}}{Y}_{wm\ DFT}^{\prime}\left(k,l\right){b}_{YDFT}\left(l,y\right)\right]{b}_{XDFT}\left(k,x\right) $$
(11)
  • m,n – indexes of discrete coefficients of the two-dimensional autocepstral matrix.
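The role of the cubed cepstrum in locating an added, translated copy can be illustrated with a one-dimensional toy sketch in plain Python. It is a simplification of Eqs. (9)–(11), not the implementation of [15]: the signal is 1-D rather than a 2-D luminance matrix, the DFT is unnormalized (which only changes the coefficient at index 0), the shift is circular, and the host signal, the values of `shift` and `delta`, the threshold `tau`, and all function names are assumptions chosen for illustration.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse includes the 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def cubed_cepstrum(signal):
    """Cube of the real cepstrum: (IDFT(ln|DFT(signal)|))^3."""
    log_mag = [math.log(abs(v)) for v in dft(signal)]
    return [c.real ** 3 for c in dft(log_mag, inverse=True)]

def detect_shift(signal, tau):
    """Return the index of the strongest cepstral peak if it exceeds tau, else None."""
    ceps = cubed_cepstrum(signal)
    # Skip index 0; the real cepstrum is symmetric, so half of it is enough
    half = range(1, len(signal) // 2 + 1)
    peak = max(half, key=lambda q: abs(ceps[q]))
    return peak if abs(ceps[peak]) > tau else None

n, shift, delta = 16, 5, 0.5
host = [0.3 ** t for t in range(n)]  # smooth toy host signal
# Toy embedding: add an attenuated, circularly shifted copy of the host
marked = [host[t] + delta * host[(t - shift) % n] for t in range(n)]

found = detect_shift(marked, tau=0.01)
found_in_host = detect_shift(host, tau=0.01)
```

In this sketch the added copy produces a cepstral coefficient of about `delta / 2` at the index equal to the shift, and cubing widens the gap between that peak and the remaining coefficients, which makes the comparison with the threshold easier.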

In the degraded watermarked picture the translation coordinates of the luminance copies correspond to the coordinates of the cepstral coefficient for which the cube of the two-dimensional autocepstral function, owing to the copy of its own signal, reaches a much higher value, in accordance with [3]. Once the decision threshold τ has been crossed, the coordinates of the cepstral coefficient \( {Y}_{wm\; cepst}^{\prime}\left(m,n\right) \) determine the inverse translation values \( {p}_x,{p}_y \) of the copy of the luminance matrix, while the sign (subtraction or addition) is determined by the phase of \( {Y}_{wm\; cepst}^{\prime}\left(m,n\right) \):

$$ {Y}_{c-wm}^{\prime}\left(x,y\right)={Y}_{wm}^{\prime}\left(x,y\right)\mp {Y}_{wm}^{\prime}\left(x+{p}_x,y+{p}_y\right)\delta $$
(12)
  • δ – watermark power coefficient, calculated empirically in [15].
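The subtraction in Eq. (12) can be sketched as follows in plain Python. The embedding used here (adding a circularly shifted, attenuated copy of the luminance matrix to itself) is a toy model assumed for illustration, as are the function names and the values of `px`, `py`, and `delta`; only the minus-sign variant of Eq. (12) is shown.

```python
def add_echo(y, px, py, delta):
    """Toy embedding: add a circularly shifted, attenuated copy of y to itself."""
    rows, cols = len(y), len(y[0])
    return [[y[r][c] + delta * y[(r + px) % rows][(c + py) % cols]
             for c in range(cols)] for r in range(rows)]

def remove_echo(y_wm, px, py, delta):
    """Eq. (12), minus sign: subtract the shifted copy of y_wm scaled by delta."""
    rows, cols = len(y_wm), len(y_wm[0])
    return [[y_wm[r][c] - delta * y_wm[(r + px) % rows][(c + py) % cols]
             for c in range(cols)] for r in range(rows)]

def max_abs_diff(a, b):
    """Largest absolute difference between two matrices of equal size."""
    return max(abs(a[r][c] - b[r][c])
               for r in range(len(a)) for c in range(len(a[0])))

y = [[16.0 + 8 * r + c for c in range(8)] for r in range(8)]  # toy luminance
px, py, delta = 2, 3, 0.1
y_wm = add_echo(y, px, py, delta)
y_c = remove_echo(y_wm, px, py, delta)

err_before = max_abs_diff(y_wm, y)
err_after = max_abs_diff(y_c, y)
```

In this toy model a single subtraction leaves a residual proportional to `delta ** 2`, because the subtracted copy is taken from the watermarked matrix rather than from the original; for a small power coefficient this residual is much smaller than the embedded copy itself.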

Processing the luminance matrix of the disturbed watermarked picture \( {Y}_{wm}^{\prime } \) in this way yields a luminance matrix with the watermark eliminated, \( {Y}_{c-wm}^{\prime}\left(x,y\right) \), which is the same as matrix \( {Y}^{\prime}\left(x,y\right) \). The last step is the transformation of the matrix from YCbCr back to RGB, which results in the output signal \( {O}^{\prime } \).

A masking function \( {M}_{c-wm} \) has also been implemented. It recovers the form of the original signal \( {O}_{deg}^{\prime } \) degraded in the communication channel from the degraded watermarked signal \( {O}_{wm}^{\prime } \), both for the cepstral watermarking described in [15] and for algorithms that use functions modulating added copies of the whole original signal or of its component parts. The developed masking function uses Wiener blind deconvolution, that is, separation of the watermark signal (which in the general case is de facto imperceptible noise added to the original picture) from the original signal. The diagram of an ideal deconvolution is shown in Fig. 1:

Fig. 1 Diagram of an ideal deconvolution system

In practice, it is impossible to separate two convolved signals perfectly with a system of this nature, so a homomorphic filter is used which, in each of two cases, eliminates one of the estimated convolved signals (Fig. 2):

Fig. 2 Example of two homomorphic deconvolution filters

A diagram of the masking algorithm is shown in Fig. 3:

Fig. 3 The diagram of the masking algorithm

The blind deconvolution filter for the luminance matrix of the watermarked signal uses a likelihood maximization algorithm, which yields a filtered luminance matrix \( {Y}_{wiener}^{\prime } \) and a restored estimate of the PSF (Point Spread Function, i.e. the response of the imaging system to a point source) used by the \( {E}_{wmf} \) function. In order to initialize the blind deconvolution function in an optimal manner (an example of worst-case adaptation is shown in Fig. 11), an adaptive spatial motion filter is used, with coefficients h calculated as follows:

  • we create an empty matrix for the motion model,

  • we supplement it with a line segment (vector) of length l (e.g. 2 pixels) at angle θ (\( 135{}^{\circ} \)), centered on the middle coefficient of the h filter matrix,

  • for each coordinate (i, j) we calculate the nearest distance ND between this location and the segment of the model,

  • \( h= \max \left(1-ND,0\right) \),

  • we then normalize the coefficients h: \( h=h/ sum\left(h\left(:\right)\right) \).

Fig. 4 Addition of translated luminance copy with reduced values of coefficients

Fig. 5 Original image

Fig. 6 Watermarked image

Fig. 7 Image without watermark after use of elimination function \( {F}_{c-wm} \)

Fig. 8 Image without watermark after use of masking function \( {M}_{c-wm} \)

Fig. 9 Original image

Fig. 10 Watermarked image

Fig. 11 Image without watermark after desynchronized masking function \( {M}_{c-wm} \)
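The construction of the coefficients h listed above can be sketched in plain Python as follows. The kernel size, the coordinate convention (angle measured counter-clockwise, rows increasing downwards), and the function name are assumptions made for this illustration.

```python
import math

def motion_kernel(length, theta_deg):
    """Build a normalized motion-model kernel h for a centered line segment."""
    half = length / 2.0
    radius = int(math.ceil(half)) + 1          # assumed margin around the segment
    size = 2 * radius + 1
    ct = math.cos(math.radians(theta_deg))
    st = math.sin(math.radians(theta_deg))
    h = [[0.0] * size for _ in range(size)]    # empty matrix for the motion model
    for i in range(size):
        for j in range(size):
            x, y = j - radius, radius - i      # coordinates relative to the center
            # Projection onto the segment direction, clamped to the segment ends
            t = max(-half, min(half, x * ct + y * st))
            nd = math.hypot(x - t * ct, y - t * st)   # nearest distance ND
            h[i][j] = max(1.0 - nd, 0.0)
    total = sum(map(sum, h))
    return [[v / total for v in row] for row in h]    # h = h / sum(h(:))

h = motion_kernel(2, 135)
```

For l = 2 and θ = 135° this sketch gives a 5×5 kernel whose largest coefficient lies at the center and whose coefficients sum to one.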

The spatial averaging filter is a matrix with dimensions \( 4\times 4 \) and coefficient values of 0.0625. At the output of the averaging spatial filter the filtered matrix \( {Y}_{median}^{\prime } \) is obtained, which is subtracted from the degraded watermarked signal to give matrix \( {Y}_{diff}^{\prime } \). This is the estimate of the watermark matrix used in the function \( {E}_{wmf} \); it is in turn subtracted from the degraded watermarked signal, which gives matrix \( {Y}_{c-wm}^{\prime } \), the approximate luminance matrix of the degraded original signal \( {O}_{deg}^{\prime } \).

4 Efficiency test results

The filter for eliminating hidden communication described above was implemented in the Matlab programming environment and its effectiveness was tested. A total of 99 pictures were used as original signals. The experiment consisted in watermarking the pictures with the hybrid watermarking algorithm of [15], in the part pertaining to the cepstrum domain, and detecting the watermark with function \( {D}_{wmf} \), which is tantamount to determining the coordinates of the coefficient of the two-dimensional autocepstral function and the translation values \( {p}_x,{p}_y \) of the copy of the luminance matrix of the original signal O. Then the PSNR value was measured:

$$ RMS = \sqrt{\frac{1}{XY}{\displaystyle \sum_{i=1}^X}{\displaystyle \sum_{j=1}^Y}{\left[O\left(i,j\right) - {O}_{wm}^{\prime}\left(i,j\right)\right]}^2} $$
(13)
$$ PSNR=20{ \log}_{10}\left(\frac{O_p}{RMS}\right)\left[ dB\right] $$
(14)
  • \( {O}_p \) – peak value (number of quantization levels for colors).
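A short sketch of Eqs. (13) and (14) in plain Python is given below. It assumes that RMS denotes the root of the mean squared error, that the pictures are given as equally sized matrices of sample values, and that the peak value is 255 (8-bit samples); the function name and the toy data are illustrative.

```python
import math

def psnr(original, processed, peak=255.0):
    """PSNR in dB between two equally sized matrices (Eqs. 13 and 14)."""
    rows, cols = len(original), len(original[0])
    mse = sum((original[i][j] - processed[i][j]) ** 2
              for i in range(rows) for j in range(cols)) / (rows * cols)
    if mse == 0:
        return math.inf                      # identical pictures
    rms = math.sqrt(mse)                     # Eq. (13)
    return 20.0 * math.log10(peak / rms)     # Eq. (14)

original = [[10, 20], [30, 40]]
shifted = [[11, 21], [31, 41]]               # every sample differs by exactly 1
value = psnr(original, shifted)
```

With every sample off by one, RMS equals 1 and the result reduces to 20·log10(255), roughly 48.13 dB.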

For the database of 99 pictures, the average value of the PSNR coefficient between the original and watermarked pictures was \( 39.19\; dB \), at \( BER=0\% \). After use of the elimination function \( {F}_{c-wm} \), the PSNR between the original pictures and the filtered ones was \( 67.88\; dB \), and none of the watermark signals could be detected (\( {Y}_{wm\; cepst}^{\prime}\left(m,n\right)<\tau \)). The results confirm the effectiveness of the hidden communication filter in the form of function \( {F}_{c-wm} \); in addition, Fig. 7 confirms the return of signal \( {O}_{wm}^{\prime } \) to the form \( {O}^{\prime } \).

The masking function \( {M}_{c-wm} \) proposed by the authors has also been implemented, and its effectiveness has been tested on the same database of original pictures as for function \( {F}_{c-wm} \). The result of using the masking function is shown in Fig. 8. The efficiency of masking the watermark signal is expressed as the percentage of removed information (information that was impossible to detect) from the watermarked pictures, i.e. the number of degraded original images \( {O}_{deg}^{\prime } \) in relation to the number of photos. Information i inserted into the watermarked signal \( {O}_{wm}^{\prime } \) was generated at random. The results are given in Table 1.

Table 1 Effectiveness of implemented masking function \( {M}_{c-wm} \)
  • Effect – the efficiency of the masking of the watermarking signal, measured as the ratio between the watermark signals removed from watermarked pictures and the total number of watermarked pictures, maintaining the condition of returning the watermarked picture to the form of the host image,

  • \( {M}_{size} \) – sizes of the matrix of the spatial median filter,

  • l – translation coefficient for the matrix initiating the search for the PSF of the encoding function \( {E}_{wmf} \) of the Wiener blind deconvolution filter,

  • \( {PSNR}_{orygWm} \) – PSNR calculated between the original and the watermarked picture,

  • \( {PSNR}_{orygcwm} \) – PSNR calculated between the original picture and the one recovered as a result of the use of the masking function \( {M}_{c-wm} \).

Taking into account the quality of the recovered host image, \( l=4 \) and \( {M}_{size}=\left[4,4\right] \) were adopted as the optimal values for the method of masking the watermark signal.

In addition, the method of [15] was tested against the popular masking algorithm for removing embedded content described in [12] and demonstrated high resistance to this type of processing: BER was only \( 2.04\% \). After the masking function had been executed, the PSNR between the host images and the images with the masked watermark signal was \( 39.27\; dB \). The tests were performed with the scaling coefficient of the non-linear filtering set to \( A=2 \), as recommended by the authors of [12].

5 Conclusions

The article pays particular attention to the problem of elimination and masking of watermark signals in cases where the signal containing additional, imperceptible information can be returned to its initial, i.e. original, form. When the watermark signal is removed precisely, the process is called elimination; when it is removed in an approximate manner, it is called masking. It should be stressed that the articles on watermarking written so far disregard these considerations (a few masking algorithms exist, but they have very limited use). The article presents a theoretical model of a function aimed at eliminating the additional, imperceptible information inserted using a spatial algorithm operating in the cepstrum domain. A masking function was also designed. Both functions have been implemented, and efficiency tests confirmed their high effectiveness. The significant robustness of the cepstral algorithm against a popular masking method has also been demonstrated.